This GitHub project provides the dataset and source code for our paper "Neural Machine Translation for English-Tamil", which was accepted at WMT at EMNLP 2018.
MIDAS-NMT-English-Tamil


This repository contains an English-Tamil parallel corpus with the following split sizes:

train - 183451
test - 2000
val - 1000
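
The sketch below shows one way to load a split and check that its two sides line up. It is only an illustration: it assumes each split is stored as a pair of UTF-8 text files (for example `train.en` and `train.ta`) with one sentence per line and line *i* of one file translating line *i* of the other. The helper name `load_parallel` is hypothetical and is not part of this repository's code.

```python
import os
from typing import List, Tuple


def read_lines(path: str) -> List[str]:
    """Read a UTF-8 text file and return its lines without the trailing newline."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_parallel(en_path: str, ta_path: str) -> List[Tuple[str, str]]:
    """Return (English, Tamil) pairs from two line-aligned files.

    Assumption: line i of en_path corresponds to line i of ta_path.
    """
    en_lines = read_lines(en_path)
    ta_lines = read_lines(ta_path)
    if len(en_lines) != len(ta_lines):
        raise ValueError(
            f"Line counts differ: {en_path} has {len(en_lines)}, "
            f"{ta_path} has {len(ta_lines)}"
        )
    return list(zip(en_lines, ta_lines))


if __name__ == "__main__":
    # Report the size of every split whose files are present locally.
    for split in ("train", "val", "test"):
        en_file, ta_file = f"{split}.en", f"{split}.ta"
        if os.path.exists(en_file) and os.path.exists(ta_file):
            print(split, len(load_parallel(en_file, ta_file)))
```

If the files follow the assumed layout, the printed counts can be compared against the split sizes listed above.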

Follow the MIDAS@translator repository for installation, preprocessing, training, and testing:

1. Installation
2. Preprocessing
3. Training
4. Testing

Kindly cite the below paper if you use our dataset or source code.

@inproceedings{choudhary2018neural,
  title={Neural Machine Translation for English-Tamil},
  author={Choudhary, Himanshu and Pathak, Aditya Kumar and Shah, Rajiv Ratn and Kumaraguru, Ponnurangam},
  booktitle={WMT in EMNLP 2018},
  year={2018}
}

Follow the Drive link below for train.ta:

https://drive.google.com/drive/folders/1sbuu5o1RBvtd1dm5xOsQ8b6y0Ihffkep?usp=sharing