NTU Machine Learning 2017 Fall Project: Listen and Translate

This is a project for the Machine Learning course at NTU. Given a Taiwanese audio signal, the task is to select the most likely Chinese translation from the given options. We implemented a Seq2Seq model and a Retrieval model for this task and ranked No. 2 on the public set of the Kaggle competition.

Getting Started

The following instructions will get a copy of the project up and running on your local machine for testing purposes.

Prerequisite & Toolkits

The following are the toolkits, with their versions, that you need to install to run this project

In addition, a GPU is required to run this project.
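Before running anything, it can help to confirm that a GPU is actually visible on the machine. The snippet below is a minimal sketch of such a check, not part of this project. It assumes an NVIDIA GPU with the `nvidia-smi` command-line tool installed (the project only states that a GPU is required), and `nvidia_gpu_visible` is a hypothetical helper name.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` lists at least one GPU.

    Assumption: the GPU is an NVIDIA card and the driver tools are installed.
    """
    # If the tool is not on PATH, there is no NVIDIA driver to query.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU visible:", nvidia_gpu_visible())
```

If this prints `False`, check the driver installation before trying to run the model.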

Running the tests

The following instructions show how to reproduce the results of the model.


To run the project, first clone the repository and change into its folder:

git clone ...
cd Listen-and-Translate/

Training/Testing Dataset

We used the dataset from Kaggle as our training/testing dataset and applied some preprocessing to it. The preprocessed data are stored in the data folder.


We only provide the pre-trained Retrieval Model for reproducing the results, since we found that the retrieval model achieves better accuracy on this task. The model is stored in the model folder.
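To illustrate the general idea of retrieval-style selection, here is a minimal sketch that scores each candidate translation against the audio and picks the best one. It is not the model shipped in this repository: the real model learns its encoders and scoring function, whereas this sketch assumes the audio and the candidates have already been encoded as fixed-length vectors and uses plain cosine similarity as a stand-in score. The function names and the toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_translation(
    audio_vec: Sequence[float],
    candidate_vecs: Sequence[Sequence[float]],
) -> int:
    """Return the index of the candidate whose vector is most similar to the audio vector."""
    if not candidate_vecs:
        raise ValueError("at least one candidate is required")
    scores = [cosine_similarity(audio_vec, c) for c in candidate_vecs]
    # argmax over the candidate scores
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy, made-up embeddings: one audio clip and four candidate translations.
    audio = [0.9, 0.1, 0.0]
    candidates = [
        [0.0, 1.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0],
        [0.5, 0.5, 0.0],
    ]
    print("selected option:", select_translation(audio, candidates))
```

In a trained retrieval model, the similarity function is learned jointly with the encoders, so the fixed cosine score here only shows the select-the-highest-score step.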

Source Codes & Report

The code for our model structures is in the src folder, and further explanations are in our Report.pdf.

Reproducing results

To reproduce the results, run the script file with the path to the test.csv file as its argument. The resulting csv file will be written to the prediction folder.

bash <test.csv file path>

Kaggle Competition Website

Our model ranked No. 2 on the public set and No. 3 on the private set. This is the Kaggle Competition Website.


[1] The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems. Ryan Lowe et al. (2016)
[2] Attention-Based Models for Speech Recognition. Jan Chorowski et al. (2015)


Machine Learning Project 2017 Fall @ NTU




