# Neural Machine Translation with TensorFlow

[The TensorFlow team recently released a tutorial](https://github.com/tensorflow/nmt) on [neural machine translation](https://en.wikipedia.org/wiki/Neural_machine_translation). Their tutorial shows off some of the functionality in the [TensorFlow seq2seq library](https://www.tensorflow.org/api_guides/python/contrib.seq2seq).

The code presented in the tutorial is "lightweight, high-quality, production-ready, and incorporated with the latest research ideas." With a pitch like that, how could we not be interested?

This notebook will show you how to work with that model in Datalab.

## Getting access to the code

To begin with, we need to copy their code from the [tensorflow/nmt](https://github.com/tensorflow/nmt) repository onto the persistent disk attached to the [GCE instance](https://cloud.google.com/compute/docs/instances/) hosting this notebook.

Fortunately, any functionality that's available in a Jupyter notebook is also available in Datalab. That means that we can access a shell on the instance through the notebook by prefixing a command with the `!` symbol.

Let us clone [tensorflow/nmt](https://github.com/tensorflow/nmt) into the directory specified below:

In [1]:
TFNMT_DIR = '/tmp/tf-nmt'

In [2]:
!git clone https://github.com/tensorflow/nmt $TFNMT_DIR

Cloning into '/tmp/tf-nmt'...
remote: Counting objects: 694, done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 694 (delta 9), reused 7 (delta 3), pack-reused 671
Receiving objects: 100% (694/694), 940.15 KiB | 2.84 MiB/s, done.
Resolving deltas: 100% (458/458), done.
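
The `!` shell escape only exists inside a notebook. As a rough sketch of the same clone step in plain Python, assuming `git` is available on the PATH, something like the following could be used. The helper names `clone_command` and `clone_repo` are hypothetical and are not part of Datalab or tensorflow/nmt.

```python
import os
import subprocess

REPO_URL = 'https://github.com/tensorflow/nmt'
TFNMT_DIR = '/tmp/tf-nmt'


def clone_command(repo_url, target_dir):
    """Build the argument list for `git clone` (hypothetical helper)."""
    return ['git', 'clone', repo_url, target_dir]


def clone_repo(repo_url, target_dir):
    """Clone repo_url into target_dir unless a checkout is already there.

    Returns True if a clone was performed, False if it was skipped.
    """
    if os.path.isdir(os.path.join(target_dir, '.git')):
        return False  # already cloned; nothing to do
    # check_call raises CalledProcessError if git exits with a non-zero status.
    subprocess.check_call(clone_command(repo_url, target_dir))
    return True
```

Calling `clone_repo(REPO_URL, TFNMT_DIR)` would then stand in for the `!git clone` cell, with the added convenience that re-running it does not fail when the directory already exists.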


To run the code in that tutorial as-is from this notebook, the working directory must be the cloned repository, `/tmp/tf-nmt`. We make that change with `cd $TFNMT_DIR` just before launching the training job.

## Running the translation models

Let us begin with a modest goal: to get their pre-defined models working in this environment.

Once we have managed to do so, we can start thinking about messing with the guts of their models and defining our own.

We will do the following things:

1. Obtain training and test data

2. Train a model without attention (an "inattentive" model) as defined by tensorflow/nmt

3. Use the model we trained to perform inference

### Getting data

Fortunately, the tutorial authors have provided a script we can use to download the training data:

In [3]:
!cat $TFNMT_DIR/nmt/scripts/download_iwslt15.sh

#!/bin/sh
# Download small-scale IWSLT15 Vietnames to English translation data for NMT
# model training.
#
# Usage:
#   ./download_iwslt15.sh path-to-output-dir
#
# If output directory is not specified, "./iwslt15" will be used as the default
# output directory.
OUT_DIR="${1:-iwslt15}"
SITE_PREFIX="https://nlp.stanford.edu/projects/nmt/data"

mkdir -v -p $OUT_DIR

# Download iwslt15 small dataset from standford website.
echo "Download training dataset train.en and train.vi."
curl -o "$OUT_DIR/train.en" "$SITE_PREFIX/iwslt15.en-vi/train.en"
curl -o "$OUT_DIR/train.vi" "$SITE_PREFIX/iwslt15.en-vi/train.vi"

echo "Download dev dataset tst2012.en and tst2012.vi."
curl -o "$OUT_DIR/tst2012.en" "$SITE_PREFIX/iwslt15.en-vi/tst2012.en"
curl -o "$OUT_DIR/tst2012.vi" "$SITE_PREFIX/iwslt15.en-vi/tst2012.vi"

echo "Download test dataset tst2013.en and tst2013.vi."
curl -o "$OUT_DIR/tst2013.en" "$SITE_PREFIX/iwslt15.en-vi/tst2013.en"
curl -o "$OUT_DIR/tst2013.vi" "$SITE_PRE
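
The script's `curl` calls all follow one pattern: a site prefix, the dataset name, and a file name built from a split and a language code. The sketch below reproduces that pattern in Python so the URLs can be inspected before anything is downloaded. `iwslt15_downloads` is a hypothetical helper, and the `tst2013.vi` URL is inferred from the pattern because the listing above is cut off. The `vocab.*` files are left out since their download lines are not visible in the listing.

```python
import os

SITE_PREFIX = 'https://nlp.stanford.edu/projects/nmt/data'
DATASET = 'iwslt15.en-vi'
SPLITS = ('train', 'tst2012', 'tst2013')  # training, dev and test sets
LANGS = ('en', 'vi')


def iwslt15_downloads(out_dir='iwslt15'):
    """Return (url, destination) pairs mirroring the script's curl calls."""
    pairs = []
    for split in SPLITS:
        for lang in LANGS:
            name = '{}.{}'.format(split, lang)
            url = '{}/{}/{}'.format(SITE_PREFIX, DATASET, name)
            pairs.append((url, os.path.join(out_dir, name)))
    return pairs
```

For example, `iwslt15_downloads('/tmp/tf-nmt/data')` yields six pairs, starting with the URL and destination for `train.en`.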

Let us make a directory in which to store this data and then use the script to download it there:

In [4]:
DATA_DIR = '{}/data'.format(TFNMT_DIR)

In [5]:
!$TFNMT_DIR/nmt/scripts/download_iwslt15.sh $DATA_DIR

mkdir: created directory ‘/tmp/tf-nmt/data’
Download training dataset train.en and train.vi.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12.9M  100 12.9M    0     0  45.2M      0 --:--:-- --:--:-- --:--:-- 45.3M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 17.2M  100 17.2M    0     0  67.5M      0 --:--:-- --:--:-- --:--:-- 67.5M
Download dev dataset tst2012.en and tst2012.vi.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  136k  100  136k    0     0  2832k      0 --:--:-- --:--:-- --:--:-- 2853k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Le

In [6]:
!ls $DATA_DIR

train.en  tst2012.en  tst2013.en  vocab.en
train.vi  tst2012.vi  tst2013.vi  vocab.vi


We now have the data we need to continue.
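
Before starting a long training run it can be worth confirming programmatically that every expected file is present. The following is a minimal sketch, assuming the file names shown in the `ls` output above; `missing_files` is a hypothetical helper, not something provided by tensorflow/nmt.

```python
import os


def missing_files(data_dir,
                  prefixes=('train', 'tst2012', 'tst2013', 'vocab'),
                  langs=('en', 'vi')):
    """Return the expected data file names that are absent from data_dir."""
    expected = ['{}.{}'.format(prefix, lang)
                for prefix in prefixes
                for lang in langs]
    return [name for name in expected
            if not os.path.isfile(os.path.join(data_dir, name))]
```

An empty list from `missing_files(DATA_DIR)` would indicate that all eight files from the listing are in place.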

### Training

We begin by designating a directory into which TensorFlow can store the model checkpoints:

In [7]:
MODEL_DIR = '{}/model'.format(TFNMT_DIR)

Now we can switch to the repository directory and run the training job as indicated in the NMT tutorial, with only a few modifications:

In [8]:
cd $TFNMT_DIR

/tmp/tf-nmt


In [None]:
!python -m nmt.nmt \
    --src=vi --tgt=en \
    --vocab_prefix=$DATA_DIR/vocab  \
    --train_prefix=$DATA_DIR/train \
    --dev_prefix=$DATA_DIR/tst2012  \
    --test_prefix=$DATA_DIR/tst2013 \
    --out_dir=$MODEL_DIR \
    --num_train_steps=12000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu

# Job id 0
# Loading hparams from /tmp/tf-nmt/model/hparams
  saving hparams to /tmp/tf-nmt/model/hparams
  saving hparams to /tmp/tf-nmt/model/best_bleu/hparams
  attention=
  attention_architecture=standard
  batch_size=128
  beam_width=0
  best_bleu=0
  best_bleu_dir=/tmp/tf-nmt/model/best_bleu
  bpe_delimiter=None
  check_special_token=True
  colocate_gradients_with_ops=True
  decay_factor=0.98
  decay_steps=10000
  dev_prefix=/tmp/tf-nmt/data/tst2012
  dropout=0.2
  encoder_type=uni
  eos=</s>
  epoch_step=0
  forget_bias=1.0
  infer_batch_size=32
  init_op=uniform
  init_weight=0.1
  learning_rate=1.0
  learning_rate_warmup_factor=1.0
  learning_rate_warmup_steps=0
  length_penalty_weight=0.0
  log_device_placement=False
  max_gradient_norm=5.0
  max_train=0
  metrics=[u'bleu']
  num_buckets=5
  num_embeddings_partitions=0
  num_gpus=1
  num_layers=2
  num_residual_layers=0
  num_train_steps=12000
  num_units=128
  optimizer=sgd
  out_dir=/tmp/tf-nmt/model
  pass_hidden_state=Tru

  global step 900 lr 1 step-time 0.83s wps 6.75K ppl 155.35 bleu 0.00
  global step 1000 lr 1 step-time 0.83s wps 6.69K ppl 142.41 bleu 0.00
# Save eval, global step 1000
2017-09-14 10:42:19.707003: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.vi is already initialized.
2017-09-14 10:42:19.707434: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.en is already initialized.
2017-09-14 10:42:19.707465: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.en is already initialized.
  loaded infer model parameters from /tmp/tf-nmt/model/translate.ckpt-1000, time 0.04s
  # 317
    src: Chúng ta cần tìm cách để đối đầu với những thách thức , những vấn đề và cả nỗi khổ mà sự bất công gây ra
    ref: We need to find ways to embrace these challenges , these problems , the suffering .
    nmt: We &apos

  global step 2900 lr 1 step-time 0.80s wps 6.91K ppl 74.42 bleu 1.06
  global step 3000 lr 1 step-time 0.82s wps 6.95K ppl 76.33 bleu 1.06
# Save eval, global step 3000
2017-09-14 11:10:44.404364: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.vi is already initialized.
2017-09-14 11:10:44.404908: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.en is already initialized.
2017-09-14 11:10:44.404908: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.en is already initialized.
  loaded infer model parameters from /tmp/tf-nmt/model/translate.ckpt-3000, time 0.04s
  # 420
    src: Thực ra Ông thu thập các loại cá .
    ref: He was actually collecting fish .
    nmt: <unk> are <unk> .
2017-09-14 11:10:44.447406: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize f

2017-09-14 11:39:14.812540: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.vi is already initialized.
2017-09-14 11:39:14.812891: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.en is already initialized.
  loaded eval model parameters from /tmp/tf-nmt/model/translate.ckpt-5000, time 0.04s
  eval dev: perplexity 47.21, time 3s, Thu Sep 14 11:39:18 2017.
  eval test: perplexity 55.43, time 3s, Thu Sep 14 11:39:22 2017.
2017-09-14 11:39:22.649389: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.vi is already initialized.
2017-09-14 11:39:22.649507: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/vocab.en is already initialized.
2017-09-14 11:39:22.649688: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to init

  done, num sentences 1553, time 8s, Thu Sep 14 11:57:35 2017.
  bleu dev: 4.7
  saving hparams to /tmp/tf-nmt/model/hparams
# External evaluation, global step 6000
  decoding to output /tmp/tf-nmt/model/output_test.
  done, num sentences 1268, time 7s, Thu Sep 14 11:57:42 2017.
  bleu test: 4.3
  saving hparams to /tmp/tf-nmt/model/hparams
  global step 6300 lr 1 step-time 0.84s wps 6.53K ppl 47.69 bleu 4.68
  global step 6400 lr 1 step-time 0.82s wps 6.85K ppl 46.49 bleu 4.68
  global step 6500 lr 1 step-time 0.84s wps 6.71K ppl 46.52 bleu 4.68
  global step 6600 lr 1 step-time 0.83s wps 6.87K ppl 46.79 bleu 4.68
  global step 6700 lr 1 step-time 0.82s wps 6.87K ppl 46.13 bleu 4.68
  global step 6800 lr 1 step-time 0.82s wps 6.86K ppl 45.68 bleu 4.68
  global step 6900 lr 1 step-time 0.81s wps 6.87K ppl 46.00 bleu 4.68
  global step 7000 lr 1 step-time 0.81s wps 6.84K ppl 45.30 bleu 4.68
# Save eval, global step 7000
2017-09-14 12:07:57.290646: W tensorflow/core/kernels/lookup_table_

  done, num sentences 1553, time 8s, Thu Sep 14 12:27:01 2017.
  bleu dev: 4.7
  saving hparams to /tmp/tf-nmt/model/hparams
# External evaluation, global step 8000
  decoding to output /tmp/tf-nmt/model/output_test.
  done, num sentences 1268, time 8s, Thu Sep 14 12:27:10 2017.
  bleu test: 4.1
  saving hparams to /tmp/tf-nmt/model/hparams
  global step 8400 lr 1 step-time 0.85s wps 6.56K ppl 41.10 bleu 4.69
  global step 8500 lr 1 step-time 0.81s wps 6.85K ppl 38.61 bleu 4.69
  global step 8600 lr 1 step-time 0.82s wps 6.84K ppl 39.63 bleu 4.69
  global step 8700 lr 1 step-time 0.83s wps 6.81K ppl 39.65 bleu 4.69
  global step 8800 lr 1 step-time 0.83s wps 6.88K ppl 39.13 bleu 4.69
  global step 8900 lr 1 step-time 0.81s wps 6.85K ppl 39.26 bleu 4.69
  global step 9000 lr 1 step-time 0.82s wps 6.86K ppl 39.83 bleu 4.69
# Save eval, global step 9000
2017-09-14 12:36:12.966346: W tensorflow/core/kernels/lookup_table_init_op.cc:347] Table trying to initialize from file /tmp/tf-nmt/data/

  done, num sentences 1553, time 9s, Thu Sep 14 12:50:46 2017.
  bleu dev: 5.9
  saving hparams to /tmp/tf-nmt/model/hparams
# External evaluation, global step 10000
  decoding to output /tmp/tf-nmt/model/output_test.
  done, num sentences 1268, time 9s, Thu Sep 14 12:50:56 2017.
  bleu test: 4.9
  saving hparams to /tmp/tf-nmt/model/hparams
  global step 10100 lr 0.98 step-time 0.83s wps 6.78K ppl 37.46 bleu 5.89
  global step 10200 lr 0.98 step-time 0.83s wps 6.80K ppl 37.05 bleu 5.89
  global step 10300 lr 0.98 step-time 0.85s wps 6.72K ppl 37.50 bleu 5.89
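
The `global step` lines in this output have a regular shape, so they can be turned into numbers for plotting or for comparing runs. Below is a minimal sketch of such a parser. The regular expression is an assumption based only on the lines shown above (for instance, that `wps` is always reported with a `K` suffix), and `parse_step_line` is a hypothetical helper rather than part of tensorflow/nmt.

```python
import re

# Pattern inferred from the log lines above; other nmt versions may differ.
STEP_LINE = re.compile(
    r'global step (?P<step>\d+) lr (?P<lr>[\d.]+) '
    r'step-time (?P<step_time>[\d.]+)s wps (?P<wps>[\d.]+)K '
    r'ppl (?P<ppl>[\d.]+) bleu (?P<bleu>[\d.]+)')


def parse_step_line(line):
    """Parse one training-progress line into a dict, or return None."""
    match = STEP_LINE.search(line)
    if match is None:
        return None  # not a progress line (e.g. a warning or a "# Save eval" line)
    stats = {key: float(value) for key, value in match.groupdict().items()}
    stats['step'] = int(stats['step'])
    stats['wps'] *= 1000.0  # the log reports words per second in thousands
    return stats
```

Applied to every line of a saved log, this gives a list of dicts from which the perplexity (`ppl`) and BLEU columns can be read off per step.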
