This is the official GitHub repository (mtdev) of supporting materials and resources for the book "Automating Translation".
Translation technology is essential for translation students, practising translators, and those working as part of the language services industry, but looming above others are the tools for automating translation: machine translation (MT) and, more recently, Generative AI based on Large Language Models (LLMs).
"Automating Translation", authored by leading experts, demystifies MT, explaining its origins, its training data, how neural MT and LLMs work, how to measure their quality, how translators interact with contemporary systems for automating translation, and how readers can build their own MT or LLM. In later chapters, the scope of the book expands to look more broadly at translation automation in audiovisual translation and localisation. Importantly, the book also examines the sociotechnical context, focusing on ethics and sustainability. Enhanced with activities, further reading and resource links, including online support material on the Routledge Translation Studies Portal, this is an essential textbook for students of translation studies, trainee and practising translators, and users of MT and multilingual LLMs.
adaptNMT streamlines all processes involved in the development and deployment of RNN and Transformer neural translation models. As an open-source application, it is designed for both technical and non-technical users who work in the field of machine translation. Built upon the widely adopted OpenNMT ecosystem, the application is particularly useful for new entrants to the field since the setup of the development environment and the creation of train, validation and test splits are greatly simplified. Graphing, embedded within the application, illustrates the progress of model training, and SentencePiece is used for creating subword segmentation models. Hyperparameter customization is facilitated through an intuitive user interface, and a single-click model development approach has been implemented. Models developed by adaptNMT can be evaluated using a range of metrics, and deployed as a translation service within the application. To support eco-friendly research in the NLP space, a green report also flags the power consumption and emissions generated during model development. The application is freely available.
adaptNMT: an open-source, language-agnostic development environment for NMT
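As a rough illustration of the train, validation and test splits mentioned in the adaptNMT description, the sketch below shuffles a sentence-aligned parallel corpus and divides it into three parts. It is not taken from adaptNMT itself: the function name `split_parallel_corpus`, the default fractions and the fixed seed are all assumptions made for this example.

```python
import random


def split_parallel_corpus(src_lines, tgt_lines, valid_frac=0.1, test_frac=0.1, seed=42):
    """Split aligned source/target sentences into train, validation and test sets.

    Hypothetical helper for illustration only; the fractions and seed are
    assumptions, not adaptNMT defaults.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must contain the same number of lines")

    # Keep each source sentence paired with its translation while shuffling.
    pairs = list(zip(src_lines, tgt_lines))
    random.Random(seed).shuffle(pairs)

    n_valid = int(len(pairs) * valid_frac)
    n_test = int(len(pairs) * test_frac)

    test = pairs[:n_test]
    valid = pairs[n_test:n_test + n_valid]
    train = pairs[n_test + n_valid:]
    return train, valid, test


if __name__ == "__main__":
    # Toy corpus: ten placeholder sentence pairs.
    src = [f"source sentence {i}" for i in range(10)]
    tgt = [f"target sentence {i}" for i in range(10)]
    train, valid, test = split_parallel_corpus(src, tgt)
    print(len(train), len(valid), len(test))
```

Shuffling the pairs together, rather than the two files separately, is what keeps each source sentence aligned with its translation in every split.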
The advent of Multilingual Language Models (MLLMs) and Large Language Models (LLMs) has spawned innovation in many areas of natural language processing. Despite the exciting potential of this technology, its impact on developing high-quality Machine Translation (MT) outputs for low-resource languages remains relatively under-explored. Furthermore, an open-source application dedicated to both fine-tuning MLLMs and managing the complete MT workflow remains unavailable. We aim to address these imbalances through the development of adaptMLLM, which streamlines all processes involved in the fine-tuning of MLLMs for MT. As an open-source application, it is designed for developers, translators and end users who work in the field of MT. The application is particularly useful for new entrants to the field since the setup of the development environment is greatly simplified.
adaptMLLM: Fine-Tuning MLLMs on Low-Resource Languages with Integrated LLM Playgrounds