Seq2seq code in PyTorch

Built on Ruotian Luo's captioning code and Sandeep Subramanian's seq2seq code.

Data preprocessing:

The first steps come from Alexandre Bérard's code:

> config/WMT14/download.sh    # download WMT14 data into raw_data/WMT14
> config/WMT14/prepare.sh     # preprocess the data, and copy the files to data/WMT14

Then run the following to save the preprocessed data in h5 files:

> python scripts/prepro_text.py

Training:

Training requires directories for saving the model's snapshots and the TensorBoard events:

> mkdir -p save events
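
If you prefer to create these directories from Python (for example at the top of a launch script), the same step can be sketched as below. The helper name `make_run_dirs` and its `root` parameter are hypothetical and not part of this repository; only the directory names `save` and `events` come from the command above.

```python
from pathlib import Path


def make_run_dirs(root="."):
    """Create the `save` and `events` directories under `root`.

    Hypothetical helper mirroring `mkdir -p save events`:
    existing directories are left untouched.
    """
    created = []
    for name in ("save", "events"):
        path = Path(root) / name
        # parents=True and exist_ok=True give the same behaviour as `mkdir -p`
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for directory in make_run_dirs():
        print(f"ready: {directory}")
```

Calling the helper twice is safe, since existing directories are not an error.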

To train a model with the parameters defined in config.yaml:

> python nmt.py -c config.yaml

Check options/opts.py for more about the options.
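
As a rough illustration of how a `-c` flag and a set of default options can fit together, here is a minimal sketch. It is not the code in options/opts.py: the long flag `--config`, the helper names, and the option names `batch_size` and `learning_rate` are all assumptions made for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse a command line shaped like `nmt.py -c config.yaml` (sketch)."""
    parser = argparse.ArgumentParser(description="Train a seq2seq model (sketch)")
    # `--config` is an assumed long form; the README only shows `-c`
    parser.add_argument("-c", "--config", required=True,
                        help="path to a YAML configuration file")
    return parser.parse_args(argv)


def merge_options(defaults, overrides):
    """Return a new dict in which config-file values override the defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    args = parse_args(["-c", "config.yaml"])
    # Hypothetical option names; the real ones are listed in options/opts.py
    defaults = {"batch_size": 32, "learning_rate": 0.001}
    overrides = {"batch_size": 64}  # stands in for values read from the config file
    print(args.config, merge_options(defaults, overrides))
```

Reading the YAML file itself is left out on purpose; in practice the `overrides` dictionary would come from whatever loader the project uses.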

To evaluate a model:

> python eval.py -c config

To submit jobs via OAR, use either train.sh or select_train.sh.
