Automatically classifying translation techniques
The code and usage manual are still being completed; please check back later.
Automatically classify translation techniques based on manually annotated examples from an English-French parallel corpus of TED Talks.
This project provides the code and dataset for these two papers:
- Towards Recognizing Phrase Translation Processes: Experiments on English-French
  Yuming Zhai, Pooyan Safari, Gabriel Illouz, Alexandre Allauzen, and Anne Vilnat
  20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLING), 2019 (preprint version)
- Classification automatique des procédés de traduction
  Yuming Zhai, Gabriel Illouz, and Anne Vilnat
  26ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), 2019
Dependencies needed:
- Python version 3.0 or higher
- PyTorch version 0.4.1
- fastText version 0.2.0
- wordfreq, a Python 3 library that tokenizes multilingual text consistently: https://pypi.org/project/wordfreq/
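
To confirm that the libraries listed above are available before running the code, a small check such as the following may help. It is only a sketch and not part of the project: the import names (`torch`, `fasttext` or `fastText`, `wordfreq`) are assumptions, since the module name can differ depending on how each package was installed, and the helper `find_missing` is hypothetical. It does not verify version numbers.

```python
import importlib.util

# Candidate import names for each dependency. These names are assumptions:
# the dependency list gives the packages, not the modules they expose.
CANDIDATES = {
    "PyTorch": ["torch"],
    "fastText": ["fasttext", "fastText"],
    "wordfreq": ["wordfreq"],
}


def find_missing(candidates=CANDIDATES):
    """Return the dependency names for which no candidate module is found."""
    missing = []
    for name, modules in candidates.items():
        # find_spec returns None when a top-level module cannot be located.
        if not any(importlib.util.find_spec(m) is not None for m in modules):
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing():
        print("Missing dependency:", name)
```

Running the script prints one line per dependency that could not be located and prints nothing when all of them are found.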