## What you will learn:
1. [Vocabulary: how to use it](#voc)
2. [Load trained model](#load_model)
3. [Translate sentences](#translate)

### Before starting

First, add the path to your local copy of the the-story-of-heads repo to `sys.path`.

In [1]:
import sys

sys.path.insert(0, 'path_to_good_translation_wrong_in_context') # insert your local path to the repo

## Vocabulary <a name="voc"></a>

To load a model, you need to pass the vocabularies that were used in training. Let's load them.

In [2]:
import pickle
import numpy as np

DATA_PATH = # insert your path
VOC_PATH =  # insert your path

# VOC_PATH is expected to end with a path separator
with open(VOC_PATH + 'src.voc', 'rb') as f:
    inp_voc = pickle.load(f)
with open(VOC_PATH + 'dst.voc', 'rb') as f:
    out_voc = pickle.load(f)

#### What you can do with a vocabulary

You can get ids of tokens in the vocabulary, as well as tokens corresponding to ids:

In [3]:
inp_voc.ids("i saw a cat".split())

[7, 253, 11, 1162]

In [4]:
inp_voc.words([12, 123, 1234, 12345])

["'s", 'by', '6', 'knack']

Reserved token ids are:

In [5]:
inp_voc.ids(['_BOS_', '_EOS_', '_UNK_'])

[0, 1, 2]

`_BOS_` - beginning-of-sentence token; not used in the standard setting

`_EOS_` - end-of-sentence token; this is the last token of every sentence

`_UNK_` - unknown token; if you are using BPE, you probably won't see it.
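
To make the role of the reserved tokens concrete, here is a minimal sketch of how such a vocabulary could behave. `ToyVoc` is a hypothetical stand-in written for illustration, not the repo's vocabulary class; it assumes that reserved tokens occupy ids 0-2 (as in the output above) and that out-of-vocabulary tokens map to `_UNK_`, which is the usual convention.

```python
class ToyVoc:
    """Hypothetical stand-in for a vocabulary; for illustration only."""

    RESERVED = ['_BOS_', '_EOS_', '_UNK_']

    def __init__(self, tokens):
        # Reserved tokens take ids 0, 1, 2; regular tokens follow in order.
        self._id2tok = self.RESERVED + [t for t in tokens if t not in self.RESERVED]
        self._tok2id = {tok: i for i, tok in enumerate(self._id2tok)}
        self.unk_id = self._tok2id['_UNK_']

    def ids(self, words):
        # Assumption: unknown tokens fall back to the _UNK_ id.
        return [self._tok2id.get(w, self.unk_id) for w in words]

    def words(self, ids):
        return [self._id2tok[i] for i in ids]


toy_voc = ToyVoc(['i', 'saw', 'a', 'cat'])
toy_ids = toy_voc.ids('i saw a dog'.split())
print(toy_ids)                 # [3, 4, 5, 2]
print(toy_voc.words(toy_ids))  # ['i', 'saw', 'a', '_UNK_']
```

With a BPE-segmented corpus, rare words are split into known subword units, which is why `_UNK_` rarely shows up in practice.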

## Load model <a name="load_model"></a>

Import libraries and create a session.

In [6]:
%env CUDA_VISIBLE_DEVICES=0

import tensorflow as tf
import lib
import lib.task.seq2seq.models.transformer as tr

tf.reset_default_graph()
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.99, allow_growth=True)
sess = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=gpu_options))

env: CUDA_VISIBLE_DEVICES=0


First, copy the model hyperparameters from your training config. In this notebook, we'll use a model with pruned encoder self-attention heads.

In [7]:
hp = {
     "num_layers": 6,
     "num_heads": 8,
     "ff_size": 2048,
     "ffn_type": "conv_relu",
     "hid_size": 512,
     "emb_size": 512,
     "res_steps": "nlda", 
    
     "rescale_emb": True,
     "inp_emb_bias": True,
     "normalize_out": True,
     "share_emb": False,
     "replace": 0,
    
     "relu_dropout": 0.1,
     "res_dropout": 0.1,
     "attn_dropout": 0.1,
     "label_smoothing": 0.1,
    
     "translator": "ingraph",
     "beam_size": 4,
     "beam_spread": 3,
     "len_alpha": 0.6,
     "attn_beta": 0,
}
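
Since these values are copied by hand, a quick consistency check can catch typos before the graph is built. The sketch below is a hypothetical helper, not part of the repo; it assumes the standard Transformer convention that the hidden size is split evenly across attention heads and that dropout-like rates lie in [0, 1).

```python
def check_hp(hp):
    """Hypothetical helper: return a list of problems found in a hyperparameter dict."""
    problems = []
    # Assumption: hid_size is divided evenly among the attention heads.
    if hp["hid_size"] % hp["num_heads"] != 0:
        problems.append("hid_size must be divisible by num_heads")
    # Dropout and label-smoothing rates are probabilities strictly below 1.
    for name in ("relu_dropout", "res_dropout", "attn_dropout", "label_smoothing"):
        if not 0.0 <= hp.get(name, 0.0) < 1.0:
            problems.append(name + " must be in [0, 1)")
    return problems


# A subset of the hyperparameters above, used only to exercise the check.
sample_hp = {"hid_size": 512, "num_heads": 8, "relu_dropout": 0.1, "label_smoothing": 0.1}
print(check_hp(sample_hp))                        # []
print(check_hp(dict(sample_hp, num_heads=7)))     # ['hid_size must be divisible by num_heads']
```

Note that a check like this only validates internal consistency; the values still have to match the ones the checkpoint was trained with.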

Now you can create the model. Pass the vocabularies and the hyperparameters.

In [8]:
model = tr.Model('mod', inp_voc, out_voc, inference_mode='fast', **hp)

#### Load checkpoint

In [9]:
path_to_ckpt = # insert path to the final checkpoint
var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
lib.train.saveload.load(path_to_ckpt, var_list)

## Translate <a name="translate"></a>

In [11]:
# load test set
path_to_testset = # path to your data
with open(path_to_testset + 'test.src') as f:
    test_src = f.readlines()
with open(path_to_testset + 'test.dst') as f:
    test_dst = f.readlines()

To translate, just pass a list of sentences to the `translate_lines` method of the model:

In [12]:
test_src[2:5]

["otherwise , he 'll tell them the truth .\n",
 "there 's evidence of early ul `cer `ation .\n",
 "in my experience , if you really don 't want to worry about something , you lock it into a cage ...\n"]

In [13]:
model.translate_lines(test_src[:3])

['о , это ужасно , чувак .', 'другой ?', 'иначе он расскажет им правду .']

To translate a whole test set, call `translate_lines` on successive batches (50-100 sentences per batch works well).
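
A minimal sketch of such a batching loop is shown below. `translate_in_batches` and `fake_translate` are hypothetical names introduced for illustration; the helper only assumes that the translation function takes a list of sentences and returns a list of translations in the same order, as `translate_lines` does in the cell above.

```python
def translate_in_batches(translate_fn, lines, batch_size=64):
    """Translate `lines` in consecutive chunks of at most `batch_size` sentences."""
    translations = []
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        # Results are appended in input order, so outputs stay aligned with inputs.
        translations.extend(translate_fn(batch))
    return translations


# Stand-in for a real model so the sketch runs without TensorFlow.
def fake_translate(batch):
    return [sentence.upper() for sentence in batch]


demo = translate_in_batches(fake_translate, ["a b", "c d", "e f"], batch_size=2)
print(demo)  # ['A B', 'C D', 'E F']
```

With the model loaded in this notebook, the call would look like `translate_in_batches(model.translate_lines, test_src, batch_size=64)`.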

**Do not forget to unbpe your translations before evaluating BLEU score!**

In [14]:
def unbpe(sent):
    # A " `" marks a BPE split inside a word; removing it rejoins the pieces.
    return sent.replace(' `', '')

In [15]:
print(model.translate_lines(['i saw a hungry cat'])[0])
print(unbpe(model.translate_lines(['i saw a hungry cat'])[0]))

я видел голод `ного кота .
я видел голодного кота .
