
Attention-based Neural Machine Translation with LSTM

An implementation of the encoder-decoder model with the global attention mechanism (Luong et al., 2015). Stacked multi-layer LSTM RNNs are used for both the encoder and the decoder, together with global attention and the input-feeding approach. During training, you can use scheduled sampling (Bengio et al., 2015) to bridge the gap between training and inference in sequence prediction tasks.
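
For orientation, the heart of the global attention mechanism can be sketched in PyTorch as follows. This is a minimal sketch of Luong's "general" scoring variant; the class and variable names are illustrative and not taken from this repository:

import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    # Luong-style global attention with the "general" score function.
    def __init__(self, hidden_size):
        super().__init__()
        self.w_a = nn.Linear(hidden_size, hidden_size, bias=False)      # score weights W_a
        self.w_c = nn.Linear(2 * hidden_size, hidden_size, bias=False)  # output weights W_c

    def forward(self, dec_state, enc_outputs):
        # dec_state:   (batch, hidden)          current decoder hidden state h_t
        # enc_outputs: (batch, src_len, hidden) all encoder hidden states h_s
        scores = torch.bmm(enc_outputs, self.w_a(dec_state).unsqueeze(2))  # (batch, src_len, 1)
        align = torch.softmax(scores.squeeze(2), dim=1)                    # alignment weights a_t
        context = torch.bmm(align.unsqueeze(1), enc_outputs).squeeze(1)    # context vector c_t
        # Attentional hidden state h~_t = tanh(W_c [c_t; h_t]); with input feeding,
        # h~_t is fed back into the decoder alongside the next target embedding.
        attn_h = torch.tanh(self.w_c(torch.cat([context, dec_state], dim=1)))
        return attn_h, align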

Usage

Environment setup

You need Python 3.9 installed.

Create virtual environment with:

python3 -m venv deeptrans-env

Activate virtual environment with:

. ./deeptrans-env/bin/activate

Install required dependencies with:

pip install -r requirements.txt

Preprocessing Data

Put the slowar.xml file into the APP_ROOT/data/ directory and run the following command to create the training (85%) and validation (15%) datasets.

python3 preprocess.py
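
Conceptually, the split boils down to shuffling the extracted sentence pairs and holding out 15% for validation. A minimal sketch of that step, assuming the XML has already been parsed into sentence pairs (the function name is illustrative, not taken from preprocess.py):

import random

def split_pairs(pairs, valid_ratio=0.15, seed=42):
    # Shuffle the sentence pairs, then split off the validation portion.
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_valid = int(len(pairs) * valid_ratio)
    return pairs[n_valid:], pairs[:n_valid]  # train (85%), valid (15%)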

Training

The --train and --valid options take the paths to the training and validation data files, respectively. The data files must be in tab-separated values (TSV) format. To use a GPU, set the --gpu flag. The --tf-ratio option sets the teacher-forcing ratio, i.e. the proportion of decoding steps that receive the supervised signal (the ground-truth token) as input.
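
Each line presumably holds a source sentence and its translation separated by a single tab; a made-up example line:

guten morgen	good morning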

python3 train.py \
    --gpu \
    --train ./data/train.tsv \
    --valid ./data/valid.tsv \
    --tf-ratio 0.5 \
    --savedir ./checkpoints
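
For intuition, scheduled sampling amounts to a coin flip at every decoding step: with probability --tf-ratio the decoder is fed the ground-truth token (teacher forcing), and otherwise its own previous prediction. A minimal sketch of that choice, with illustrative names not taken from train.py:

import random

def next_decoder_input(gold_token, predicted_token, tf_ratio):
    # With probability tf_ratio, use the supervised signal (teacher forcing);
    # otherwise feed back the model's own previous prediction.
    if random.random() < tf_ratio:
        return gold_token
    return predicted_token

Setting --tf-ratio 1.0 would correspond to pure teacher forcing, and 0.0 to always feeding back the model's own predictions.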

Translation

The --model option takes the path to a model file generated by train.py. The text file you want to translate is given via --input. To use a GPU, set the --gpu flag.

python3 translate.py \
    --gpu \
    --model ./checkpoints/checkpoint_best.pt \
    --input ./data/test.txt

References

Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of EMNLP 2015.

Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. 2015. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks. In Advances in Neural Information Processing Systems 28 (NIPS 2015).