Parallel Corpus for Typology
This repository contains automatically aligned film subtitles in different languages. Each set of subtitles is aligned to the English version using Jörg Tiedemann's software subalign. Some alignments have been checked manually, but this work is still in progress.
If you use these data in your research, please cite the following paper:
Levshina, Natalia. 2016. Verbs of letting in Germanic and Romance languages: A quantitative investigation based on a parallel corpus of film subtitles. Languages in Contrast 16(1): 84-117.
In addition, the folder "Originals" contains the original subtitles in English, Chinese, Latvian and Lithuanian, downloaded from opensubtitles.org for specific projects.