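## Check which GPU you were allocated
Open the Runtime menu at the top of the page, choose Change runtime type, and confirm that the hardware accelerator is set to GPU.

Run !nvidia-smi to see which GPU the session was allocated. Colab assigns GPUs dynamically, so the model can differ between sessions; the run below received a Tesla P100 rather than a Tesla T4.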

In [1]:
!nvidia-smi

Mon Apr 20 01:18:55 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P0    25W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru
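
The same check can be done from Python, which is handy when a later cell should fail early if no GPU is attached. This is a minimal sketch, not part of the original notebook: the helper name `allocated_gpu_names` is hypothetical, and it relies only on the `nvidia-smi --query-gpu=name --format=csv,noheader` query.

```python
import shutil
import subprocess


def allocated_gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    # nvidia-smi is absent on CPU-only runtimes, so check before calling it.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = allocated_gpu_names()
print(names if names else "No GPU visible; check Runtime > Change runtime type")
```

An empty list means the training cells below, which pass `--device cuda`, would probably fail on this runtime.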

## Prerequisites
Mount Google Drive, change into the project's source directory, and install the spaCy language models, torchtext, and PyTorch.

In [2]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

import os
os.chdir("/content/drive/My Drive/English-to-French-Translation/src")
!ls

!python -m spacy download en_core_web_sm
!python -m spacy download fr_core_news_sm

!pip3 install 'torchtext==0.5.0'

!pip3 install torch torchvision


Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive
dataloader  ml	runner
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')
Collecting fr_core_news_sm==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-2.2.5/fr_core_news_sm-2.2.5.tar.gz (14.7MB)
     |████████████████████████████████| 14.7MB 1.1MB/s 
Building wheels for collected packages: fr-core-news-sm
  Building wheel for fr-core-news-sm (set
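
Because the install output scrolls past quickly, it can help to confirm afterwards which versions ended up in the environment. The sketch below assumes Python 3.8 or newer (for `importlib.metadata`) and assumes that the spaCy models are registered under the distribution names listed; `installed_version` is a hypothetical helper, not part of the project.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions based on the install commands above.
for package in ("torch", "torchtext", "spacy", "en_core_web_sm", "fr_core_news_sm"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```

If torchtext does not report 0.5.0, the pinned install in the cell above may have been overridden by a later install.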

## Build the Vocabs
Build the French and English vocabularies from the Multi30k training data.

In [3]:
%%shell
DATASET=Multi30k
TRAIN=../data/${DATASET}/Training/
TEST=../data/${DATASET}/Testing/

echo "Making the vocabulary"

python3 dataloader/main.py vocab $TRAIN fr ../models/${DATASET}/vocab.french.gz
python3 dataloader/main.py vocab $TRAIN en ../models/${DATASET}/vocab.english.gz

echo "Finished making the vocabulary"

Making the vocabulary
Building fr vocab from 1 transcriptions
100% 29461/29461 [00:02<00:00, 10166.94it/s]
Built 11137 vocabs
Building en vocab from 1 transcriptions
100% 29461/29461 [00:02<00:00, 13449.45it/s]
Built 9757 vocabs
Finished making the vocabulary
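
For intuition, vocabulary building amounts to counting tokens and assigning each distinct token an integer id. The sketch below is an illustration only and is not the code in `dataloader/main.py`: it splits on whitespace instead of using the spaCy tokenizers, and the special tokens and the `min_freq` parameter are assumptions.

```python
from collections import Counter

# Hypothetical special tokens; the project may use different ones.
SPECIALS = ["<pad>", "<unk>", "<sos>", "<eos>"]


def build_vocab(sentences, min_freq=1):
    """Map each token seen at least min_freq times to an integer id."""
    counts = Counter(
        token.lower() for sentence in sentences for token in sentence.split()
    )
    words = sorted(word for word, count in counts.items() if count >= min_freq)
    # Special tokens take the first ids, followed by the words in sorted order.
    return {word: index for index, word in enumerate(SPECIALS + words)}


vocab = build_vocab(["Un homme marche .", "Un chien court ."])
print(len(vocab), vocab["homme"])
```

Raising `min_freq` drops rare words, which would then map to the unknown token at lookup time.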




## From English to French
Train the model with attention

In [6]:
%%shell
DATASET=Multi30k
TRAIN=../data/${DATASET}/Training/
TEST=../data/${DATASET}/Testing/

echo "Training the model"

python3 runner/train.py "${TRAIN}" \
    en "../models/${DATASET}/vocab.english.gz" \
    fr "../models/${DATASET}/vocab.french.gz" \
    "../models/${DATASET}/model.en.fr.pt" \
    --source-word-embedding-size 256 \
    --target-word-embedding-size 256 \
    --encoder-num-layers 3 \
    --encoder-num-attention-heads 8 \
    --encoder-pf-size 512 \
    --encoder-dropout 0.1 \
    --decoder-num-layers 3 \
    --decoder-num-attention-heads 8 \
    --decoder-pf-size 512 \
    --decoder-dropout 0.1 \
    --patience 5 \
    --train-val-ratio 0.75 \
    --batch-size 64 \
    --seed 1 \
    --device cuda \
    --resume-from-checkpoint "../models/${DATASET}/checkpoint.en.fr.pt" \
    --save-checkpoint-to "../models/${DATASET}/checkpoint.en.fr.pt"

echo "Finished training"

Training the model
Loaded 9757 words
Loaded 11137 words
100% 29461/29461 [00:08<00:00, 3453.51it/s]
Number of sentence pairs: 29460
Avg. num words in source text: 13.071554650373388
Std. num words in source text: 4.057199491280985
Max. num words in source text: 41
Min. num words in source text: 4
Avg. num words in target text: 16.263951120162933
Std. num words in target text: 4.727645134343328
Max. num words in target text: 49
Min. num words in target text: 6
Loading from checkpoint
Previous state > Epoch 2: Val loss=3.6991893813527863, num_poor=0
100% 346/346 [00:12<00:00, 27.08it/s]
100% 116/116 [00:25<00:00,  4.60it/s]
Epoch 2: Train loss=3.384525364534014, Val loss=3.0990131271296533, Val BLEU=0.1558600600424488
Saved model
Saved checkpoint
100% 346/346 [00:13<00:00, 26.50it/s]
100% 116/116 [00:24<00:00,  4.73it/s]
Epoch 3: Train loss=2.9093868994299386, Val loss=2.7277140370730697, Val BLEU=0.19385246794524408
Saved model
Saved checkpoint
100% 346/346 [00:13<00:00, 26.40it/s]
100%
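
The progress-bar totals in the log (346 and 116) follow from the flags passed to `runner/train.py`. The arithmetic below is a sketch under two assumptions: that `--train-val-ratio 0.75` sends 75% of the 29460 sentence pairs to training and the rest to validation, and that the final partial batch is kept rather than dropped.

```python
import math


def num_batches(num_examples, batch_size):
    """Number of batches when the last, partially filled batch is kept."""
    return math.ceil(num_examples / batch_size)


pairs, ratio, batch_size = 29460, 0.75, 64
train_size = int(pairs * ratio)  # 22095 pairs for training
val_size = pairs - train_size    # 7365 pairs for validation

print(num_batches(train_size, batch_size), num_batches(val_size, batch_size))
```

Under these assumptions the two numbers match the 346 training batches and 116 validation batches shown per epoch.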



Test the model with attention from English to French

In [7]:
%%shell
DATASET=Multi30k
TRAIN=../data/${DATASET}/Training/
TEST=../data/${DATASET}/Testing/

echo "Testing the model with attention from English to French"

python3 runner/test.py $TEST \
    en ../models/${DATASET}/vocab.english.gz \
    fr ../models/${DATASET}/vocab.french.gz \
    ../models/${DATASET}/model.en.fr.pt \
    --source-word-embedding-size 256 \
    --target-word-embedding-size 256 \
    --encoder-num-layers 3 \
    --encoder-num-attention-heads 8 \
    --encoder-pf-size 512 \
    --encoder-dropout 0.1 \
    --decoder-num-layers 3 \
    --decoder-num-attention-heads 8 \
    --decoder-pf-size 512 \
    --decoder-dropout 0.1 \
    --device cuda \
    --batch-size 64

echo "Finished testing"

Testing the model with attention from English to French
Loaded 9757 words
Loaded 11137 words
100% 1014/1014 [00:00<00:00, 1096.27it/s]
Number of sentence pairs: 1014
Avg. num words in source text: 13.231755424063117
Std. num words in source text: 4.053699880023548
Max. num words in source text: 32
Min. num words in source text: 4
Avg. num words in target text: 16.355029585798817
Std. num words in target text: 4.774417087948714
Max. num words in target text: 38
Min. num words in target text: 7
100% 16/16 [00:03<00:00,  4.70it/s]
Test loss=1.89397681504488, Test Bleu=0.4152381840399443
Finished testing
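
To make the reported BLEU figure easier to interpret, here is a minimal sentence-level BLEU sketch: the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. It is not the implementation used by `runner/test.py`, whose n-gram order, smoothing, and averaging are not shown in this notebook; the function names and example sentences are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed BLEU of one candidate against one reference (token lists)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any empty precision gives zero
        precisions.append(clipped / total)
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


reference = "un homme marche dans la rue".split()
candidate = "un homme court dans la rue".split()
print(sentence_bleu(reference, candidate, max_n=2))
```

A score of 1.0 means the candidate reproduces the reference exactly; a single substituted word already lowers the score noticeably, which is one reason corpus-level scores around 0.4 can still correspond to readable translations.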




## From French to English
Train the model with attention

In [8]:
%%shell
DATASET=Multi30k
TRAIN=../data/${DATASET}/Training/
TEST=../data/${DATASET}/Testing/

echo "Training the model from French to English"

python3 runner/train.py "${TRAIN}" \
    fr "../models/${DATASET}/vocab.french.gz" \
    en "../models/${DATASET}/vocab.english.gz" \
    "../models/${DATASET}/model.fr.en.pt" \
    --source-word-embedding-size 256 \
    --target-word-embedding-size 256 \
    --encoder-num-layers 3 \
    --encoder-num-attention-heads 8 \
    --encoder-pf-size 512 \
    --encoder-dropout 0.1 \
    --decoder-num-layers 3 \
    --decoder-num-attention-heads 8 \
    --decoder-pf-size 512 \
    --decoder-dropout 0.1 \
    --patience 5 \
    --train-val-ratio 0.75 \
    --batch-size 64 \
    --seed 1 \
    --device cuda \
    --resume-from-checkpoint "../models/${DATASET}/checkpoint.fr.en.pt" \
    --save-checkpoint-to "../models/${DATASET}/checkpoint.fr.en.pt"

echo "Finished training"

Training the model
Loaded 11137 words
Loaded 9757 words
100% 29461/29461 [00:08<00:00, 3427.01it/s]
Number of sentence pairs: 29460
Avg. num words in source text: 14.263951120162933
Std. num words in source text: 4.727645134343328
Max. num words in source text: 47
Min. num words in source text: 4
Avg. num words in target text: 15.071554650373388
Std. num words in target text: 4.057199491280985
Max. num words in target text: 43
Min. num words in target text: 6
100% 346/346 [00:12<00:00, 28.22it/s]
100% 116/116 [00:16<00:00,  6.93it/s]
Epoch 1: Train loss=4.661181741367185, Val loss=3.9223991591354896, Val BLEU=0.09397709865372507
Saved model
Saved checkpoint
100% 346/346 [00:12<00:00, 27.90it/s]
100% 116/116 [00:18<00:00,  6.16it/s]
Epoch 2: Train loss=3.6129164523471986, Val loss=3.2914424888018905, Val BLEU=0.1430592261817805
Saved model
Saved checkpoint
100% 346/346 [00:12<00:00, 27.60it/s]
100% 116/116 [00:26<00:00,  4.39it/s]
Epoch 3: Train loss=3.0994713306427, Val loss=2.98386860
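
The `--patience 5` flag and the `num_poor` counter in the checkpoint log suggest patience-based early stopping. The sketch below shows the usual form of that technique; it assumes `num_poor` counts consecutive epochs without a new best validation loss, which the notebook does not confirm, and the `EarlyStopping` class is hypothetical.

```python
class EarlyStopping:
    """Stop once validation loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.num_poor = 0

    def update(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.num_poor = 0  # improvement resets the counter
        else:
            self.num_poor += 1
        return self.num_poor >= self.patience


# Made-up validation losses, for illustration only.
stopper = EarlyStopping(patience=2)
for epoch, loss in enumerate([3.9, 3.3, 3.4, 3.5], start=1):
    if stopper.update(loss):
        print(f"Stopping after epoch {epoch}")
        break
```

In a scheme like this, the model is typically saved only on improvement, so the file on disk corresponds to the best validation loss rather than the last epoch.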



Test the model with attention from French to English

In [10]:
%%shell
DATASET=Multi30k
TRAIN=../data/${DATASET}/Training/
TEST=../data/${DATASET}/Testing/

echo "Testing the model with attention from French to English"

python3 runner/test.py $TEST \
    fr ../models/${DATASET}/vocab.french.gz \
    en ../models/${DATASET}/vocab.english.gz \
    ../models/${DATASET}/model.fr.en.pt \
    --source-word-embedding-size 256 \
    --target-word-embedding-size 256 \
    --encoder-num-layers 3 \
    --encoder-num-attention-heads 8 \
    --encoder-pf-size 512 \
    --encoder-dropout 0.1 \
    --decoder-num-layers 3 \
    --decoder-num-attention-heads 8 \
    --decoder-pf-size 512 \
    --decoder-dropout 0.1 \
    --device cuda \
    --batch-size 64

echo "Finished testing"

Testing the model with attention from French to English
Loaded 11137 words
Loaded 9757 words
100% 1014/1014 [00:00<00:00, 2235.47it/s]
Number of sentence pairs: 1014
Avg. num words in source text: 14.355029585798816
Std. num words in source text: 4.774417087948714
Max. num words in source text: 36
Min. num words in source text: 5
Avg. num words in target text: 15.231755424063117
Std. num words in target text: 4.053699880023548
Max. num words in target text: 34
Min. num words in target text: 6
100% 16/16 [00:03<00:00,  5.00it/s]
Test loss=2.255883067846298, Test Bleu=0.35582595922716986
Finished testing
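
One way to compare the two directions is to convert each test loss to a perplexity. This is a sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which the notebook does not state; if the loss is averaged differently, the numbers below are not true perplexities.

```python
import math

# Test losses copied from the two test runs in this notebook.
test_losses = {
    "en->fr": 1.89397681504488,
    "fr->en": 2.255883067846298,
}


def perplexity(loss):
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(loss)


for direction, loss in test_losses.items():
    print(f"{direction}: perplexity ~ {perplexity(loss):.2f}")
```

The lower value for English to French is consistent with that direction also having the higher test BLEU (0.415 versus 0.356).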


