Encoder-Decoder architecture for neural machine translation BN->EN
.
- Both LSTM and GRU can be used as Encoder or Decoder.
- Scratch implementation of vocab
language.py
and trainertrainer.py
class. - Implementation of Bahdanau attention decoder.
Install the following if not installed.
- python 3.x
- torch cuda version
- scarceblue
- configparser
- Keep your preprocess data in the
data
folder check out thesample.txt
for data format. - Change the appropriate variable inside
experiment.ini
i.e.lang1, rnn, hidden_size
. reverse
abool
variable will change the model training i.e.BN->EN
toEN->BN
- Training:
python train.py
For resume training change the obj_path
from false
to logs
and checkpoint
false
to model path i.e. path/to/model.pt
.
For train in new language you only need to change the tokenizer
in train.py
. You can use spacy
tokenizer or custom made tokernizer.