This repository has been archived by the owner. It is now read-only.

Seq2seq approach to CMPT-413 final project

This approach is based on Effective Approaches to Attention-based Neural Machine Translation, Luong et al., EMNLP 2015.
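For readers unfamiliar with the paper, the following is a minimal sketch of its global attention step using the dot score: each encoder state is scored against the current decoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of encoder states. The function names are hypothetical, plain Python lists stand in for tensors, and this README does not state which score function train.lua actually uses, so treat this as an illustration of the paper rather than of this repository's code.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(h_t, source_states):
    """Global attention with the dot score (hypothetical helper).

    h_t: current decoder hidden state, a list of floats.
    source_states: encoder hidden states, one list per source position.
    Returns (alignment weights, context vector).
    """
    # score(h_t, h_s) = h_t . h_s for every source position
    scores = [dot(h_t, h_s) for h_s in source_states]
    weights = softmax(scores)
    # context vector = alignment-weighted average of the encoder states
    context = [
        sum(w * h_s[i] for w, h_s in zip(weights, source_states))
        for i in range(len(h_t))
    ]
    return weights, context
```

In the paper the context vector is then concatenated with h_t and passed through a tanh layer to produce the attentional hidden state; that learned projection is left out of this sketch.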

We utilize code from

Dependencies

Torch

Get Torch ready on your system.

Torch Libraries

luarocks install hdf5
luarocks install nn
luarocks install nngraph

If you are going to train a model yourself, also install the GPU libraries:

luarocks install cutorch
luarocks install cunn
luarocks install cudnn # optional, for cuDNN acceleration

Python 2.7

Uh, I think everyone should have it installed already.

Python Libraries

pip install h5py numpy

Usage (play with the pretrained model)

Download pretrained model

Google Drive

Run

th evaluate.lua -model demo-model_final.t7 -src_file data/src-val.txt -output_file pred.txt \
-src_dict data/demo.src.dict -targ_dict data/demo.targ.dict
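After the command finishes, a quick sanity check is to compare line counts. The sketch below assumes that evaluate.lua writes one prediction per source sentence, which this README does not confirm; the helper names are hypothetical.

```python
import io


def count_lines(path):
    """Count the lines in a UTF-8 text file."""
    with io.open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_predictions(src_path, pred_path):
    """Return (counts_match, n_source_lines, n_prediction_lines).

    Assumption: the output file holds one prediction per source line.
    """
    n_src = count_lines(src_path)
    n_pred = count_lines(pred_path)
    return n_src == n_pred, n_src, n_pred
```

For the command above this would be called as check_predictions("data/src-val.txt", "pred.txt"). A mismatch suggests the run stopped early or some sentences were skipped.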

Usage (train your own model)

Split data into train and validation

Make sure that no validation sentence also appears in the training set.
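One way to get such a split is sketched below. It shuffles aligned source/target pairs, takes a fraction for validation, and then drops any training pair whose source sentence also occurs in the validation set. The function names, the 10% default, and the fixed seed are assumptions for illustration, not part of this repository.

```python
import io
import random


def split_parallel(src_lines, targ_lines, val_fraction=0.1, seed=0):
    """Split aligned sentence pairs into (train, val) with no shared sources."""
    if len(src_lines) != len(targ_lines):
        raise ValueError("source and target must have the same number of lines")
    pairs = list(zip(src_lines, targ_lines))
    # Seeded shuffle so the split is reproducible.
    random.Random(seed).shuffle(pairs)
    n_val = max(1, int(len(pairs) * val_fraction))
    val = pairs[:n_val]
    # Drop training pairs whose source sentence was chosen for validation.
    val_sources = set(s for s, _ in val)
    train = [(s, t) for s, t in pairs[n_val:] if s not in val_sources]
    return train, val


def write_pairs(pairs, src_path, targ_path):
    """Write pairs to parallel files, one sentence per line."""
    with io.open(src_path, "w", encoding="utf-8") as src, \
            io.open(targ_path, "w", encoding="utf-8") as targ:
        for s, t in pairs:
            src.write(s + u"\n")
            targ.write(t + u"\n")
```

The resulting lists could then be written with write_pairs(train, "data/src-train.txt", "data/targ-train.txt") and write_pairs(val, "data/src-val.txt", "data/targ-val.txt"), which are the file names the preprocessing command expects.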

Transform data into hdf5

python preprocess.py --srcfile data/src-train.txt --targetfile data/targ-train.txt \
--srcvalfile data/src-val.txt --targetvalfile data/targ-val.txt --outputfile data/demo

Train

th train.lua -data_file data/demo-train.hdf5 -val_data_file data/demo-val.hdf5 -savefile demo-model \
-gpuid 0 -cudnn 1 -num_layers 4 -rnn_size 100