<a href="https://colab.research.google.com/github/Neafiol/Caam-progect/blob/master/dssm/dssm_dz.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Seminar: simple question answering
![img](https://recruitlook.com/wp-content/uploads/2015/01/questionanswer3.jpg)

Today we're going to build a retrieval-based question answering model with metric learning models.

_this seminar is based on original notebook by [Oleg Vasilev](https://github.com/Omrigan/)_



In [0]:
import numpy as np
import matplotlib.pyplot as plt
import torch.nn as nn
%matplotlib inline

In [4]:
!wget https://raw.githubusercontent.com/yandexdataschool/Practical_DL/fall18/week11_dssm/utils.py

--2019-05-05 09:46:21--  https://raw.githubusercontent.com/yandexdataschool/Practical_DL/fall18/week11_dssm/utils.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9814 (9.6K) [text/plain]
Saving to: ‘utils.py’


2019-05-05 09:46:21 (108 MB/s) - ‘utils.py’ saved [9814/9814]



In [5]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

### Dataset

Today's data is Stanford Question Answering Dataset (SQuAD). Given a paragraph of text and a question, our model's task is to select a snippet that answers the question.

We are not going to solve the full task today. Instead, we'll train a model to __select the sentence containing answer__ among several options.

As usual, you are given an utility module with data reader and some helper functions

In [0]:
import utils
!wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json -O squad-v2.0.json 2> log
# backup download link: https://www.dropbox.com/s/q4fuihaerqr0itj/squad.tar.gz?dl=1
train, test = utils.build_dataset('./squad-v2.0.json', tokenized=True)

In [7]:
# the data comes pre-tokenized with this simple tokenizer:
utils.tokenize("I... I'm the monument to all your sins.")

"i ... i ' m the monument to all your sins ."

In [8]:
pid, question, options, correct_indices, wrong_indices = train.iloc[40]
print('QUESTION', question, '\n')
for i, cand in enumerate(options):
    print(['[ ]', '[v]'][i in correct_indices], cand)

QUESTION where did beyonce get her name from ? 

[ ] beyoncé giselle knowles was born in houston , texas , to celestine ann " tina " knowles ( née beyincé ), a hairdresser and salon owner , and mathew knowles , a xerox sales manager .
[v] beyoncé ' s name is a tribute to her mother ' s maiden name .
[ ] beyoncé ' s younger sister solange is also a singer and a former member of destiny ' s child .
[ ] mathew is african - american , while tina is of louisiana creole descent ( with african , native american , french , cajun , and distant irish and spanish ancestry ).
[ ] through her mother , beyoncé is a descendant of acadian leader joseph broussard .
[ ] she was raised in a methodist household .


### Tokens & vocabularies

The procedure here is very similar to previous nlp weeks: preprocess text into tokens, create dictionaries, etc.

In [0]:
import re
def prep(X):
  mat=[]
  for q in (X['question']):
    q=q.replace(',','')
    q=q.replace('.','')
    q=q.replace('?','')
    q=q.replace('`','')

    for w in q.split(' '):
      mat.append(w)

  for o in (X["options"]):
    for q in o:
      q=q.replace(',','')
      q=q.replace('.','')
      q=q.replace('?','')
      q=q.replace('`','')

      for w in q.split(' '):
        mat.append(w)
  return mat

mat=prep(train)
mat2=prep(test)



In [12]:
len(mat)

9042915

In [0]:
from tqdm import tqdm, trange
from collections import Counter, defaultdict

#Dictionary of {token : count}
token_counts = Counter(mat)
token_counts2 = Counter(mat2)


# compute counts for each token; use token_counts;
# count BOTH in train['question'] and in train['options']


In [14]:
token_counts['me']

575

In [15]:
print("Total tokens:", sum(token_counts.values()))
print("Most common:", token_counts.most_common(5))
assert 9000000 < sum(token_counts.values()) < 9100000, "are you sure you counted all unique tokens in questions and options?"

Total tokens: 9042915
Most common: [('', 802262), ('the', 597790), ('of', 300056), ('and', 231619), ('in', 215312)]


We shall only keep tokens that are present at least 4 times

In [16]:
MIN_COUNT = 5

tokens = [c for c in token_counts if token_counts[c] >  MIN_COUNT] 
tokens = ["_PAD_", "_UNK_"] + tokens
print("Tokens left:", len(tokens))

tokens2 = [c for c in token_counts2 if token_counts2[c] >  MIN_COUNT] 
tokens2 = ["_PAD_", "_UNK_"] + tokens2

Tokens left: 40801


In [0]:
# a dictionary from token to it's index in tokens
token_to_id={}
for i in range(len(tokens)) :
  token_to_id[tokens[i]] = i 

In [18]:
token_to_id["the"]

37

In [0]:
assert token_to_id['me'] != token_to_id['woods']
assert token_to_id[tokens[42]]==42
assert len(token_to_id)==len(tokens)

In [0]:
PAD_ix = token_to_id["_PAD_"]
UNK_ix = token_to_id['_UNK_']

#good old as_matrix for the third time
def as_matrix(sequences, max_len=None):
    if isinstance(sequences[0], (str, bytes)):
        sequences = [utils.tokenize(s).split() for s in sequences]
        
    max_len = max_len or max(map(len,sequences))
    
    matrix = np.zeros((len(sequences), max_len), dtype='int32') + PAD_ix
    for i, seq in enumerate(sequences):
        row_ix = [token_to_id.get(word, UNK_ix) for word in seq[:max_len]]
        matrix[i, :len(row_ix)] = row_ix
    
    return matrix

In [21]:
test = as_matrix(["Definitely, thOsE tokens areN'T LowerCASE!!", "I'm the monument to all your sins."])
print(test)
assert test.shape[0]==2
print("Correct!")

[[11174     1  1994  7720  7406    19   972 12207     1     0]
 [  196    19  2521    37  6800    49   647  6234 24359     1]]
Correct!


### Data sampler

Our model trains on triplets: $<query, answer^+, answer^->$

For your convenience, we've implemented a function that samples such triplets from data

In [0]:
import random
import torch
lines_to_tensor = lambda lines, max_len=None: torch.tensor(
    as_matrix(lines, max_len=max_len), dtype=torch.int64)

def iterate_minibatches(data, batch_size, shuffle=True, cycle=False):
    """
    Generates minibatches of triples: {questions, correct answers, wrong answers}
    If there are several wrong (or correct) answers, picks one at random.
    """
    indices = np.arange(len(data))
    while True:
        if shuffle:
            indices = np.random.permutation(indices)
        for batch_start in range(0, len(indices), batch_size):
            batch_indices = indices[batch_start: batch_start + batch_size]
            batch = data.iloc[batch_indices]
            questions = batch['question'].values
            correct_answers = np.array([
                row['options'][random.choice(row['correct_indices'])]
                for i, row in batch.iterrows()
            ])
            wrong_answers = np.array([
                row['options'][random.choice(row['wrong_indices'])]
                for i, row in batch.iterrows()
            ])

            yield {
                'questions' : lines_to_tensor(questions),
                'correct_answers': lines_to_tensor(correct_answers),
                'wrong_answers': lines_to_tensor(wrong_answers),
            }
        if not cycle:
            break

In [0]:
test=train[50000:]
train=train[:5000]

In [0]:
dummy_batch = next(iterate_minibatches(train.sample(10), 2))
print(dummy_batch["questions"])
print(dummy_batch["correct_answers"])
print(dummy_batch["wrong_answers"])

### Building the model (3 points)

Our goal for today is to build a model that measures similarity between question and answer. In particular, it maps both question and answer into fixed-size vectors such that:

Our model is a pair of $V_q(q)$ and $V_a(a)$ - networks that turn phrases into vectors. 

__Objective:__ Question vector $V_q(q)$ should be __closer__ to correct answer vectors $V_a(a^+)$ than to incorrect ones $V_a(a^-)$ .

Both vectorizers can be anything you wish. For starters, let's use a convolutional network with global pooling and a couple of dense layers on top.

It is perfectly legal to share some layers between vectorizers, but make sure they are at least a little different.

In [0]:
import torch, torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

class GlobalMaxPooling(nn.Module):
    def __init__(self, dim=-1):
        super(self.__class__, self).__init__()
        self.dim = dim
        
    def forward(self, x):
        return x.max(dim=self.dim)[0]

In [0]:
# we might as well create a global embedding layer here

GLOBAL_EMB = nn.Embedding(len(tokens), 64, padding_idx=PAD_ix)

In [0]:
class QuestionVectorizer(nn.Module):

    def __init__(self, n_tokens=len(tokens), out_size=64, use_global_emb=True):
        """ 
        A simple sequential encoder for questions.
        Use any combination of layers you want to encode a variable-length input 
        to a fixed-size output vector
        
        If use_global_emb is True, use GLOBAL_EMB as your embedding layer
        """
        super(self.__class__, self).__init__()
        
        if use_global_emb:
            self.emb = GLOBAL_EMB
        else:
            self.emb = nn.Embedding(n_tokens, 64, padding_idx=PAD_ix)
            
        
        self.rnn = nn.RNN(64, out_size, num_layers =1 , dropout=0.05)
        
    def forward(self, text_ix):
        """
        :param text_ix: int64 Variable of shape [batch_size, max_len]
        :returns: float32 Variable of shape [batch_size, out_size]
        """
#         self.inp = nn.Linear(1, hidden_size)
       
        e = self.emb(text_ix)
        e = torch.transpose(e,1,0)
        
        h_0 = torch.zeros(e.shape)
      
        output, hn = self.rnn(e)
#         self.out = nn.Linear(hidden_size, 1)

        return output[-1]

In [0]:
class AnswerVectorizer(nn.Module):
    def __init__(self, n_tokens=len(tokens), out_size=64, use_global_emb=True):
        """ 
        A simple sequential encoder for questions.
        Use any combination of layers you want to encode a variable-length input 
        to a fixed-size output vector
        
        If use_global_emb is True, use GLOBAL_EMB as your embedding layer
        """
        super(self.__class__, self).__init__()
        
        if use_global_emb:
            self.emb = GLOBAL_EMB
        else:
            self.emb = nn.Embedding(n_tokens, 64, padding_idx=PAD_ix)
            
        
        self.rnn = nn.RNN(64, out_size, num_layers =1 , dropout=0.05)
        
    def forward(self, text_ix):
        """
        :param text_ix: int64 Variable of shape [batch_size, max_len]
        :returns: float32 Variable of shape [batch_size, out_size]
        """
#         self.inp = nn.Linear(1, hidden_size)
       
        e = self.emb(text_ix)
        e = torch.transpose(e,1,0)
        
        h_0 = torch.zeros(e.shape)
      
        output, hn = self.rnn(e)
#         self.out = nn.Linear(hidden_size, 1)

        return output[-1]

In [0]:
q=AnswerVectorizer(out_size=100)
q(torch.LongTensor(test)).shape

In [0]:
for vectorizer in [QuestionVectorizer(out_size=100), AnswerVectorizer(out_size=100)]:
  
    print("Testing %s ..." % vectorizer.__class__.__name__)
    dummy_x = torch.LongTensor(test)
    dummy_v = vectorizer(dummy_x)
    assert tuple(dummy_v.shape) == (dummy_x.shape[0], 100)

    del vectorizer
    print("Seems fine")

In [0]:
dummy_v.shape

### Training: loss function (3 points)
We want our vectorizers to put correct answers closer to question vectors and incorrect answers farther away from them. One way to express this is to use is Pairwise Hinge Loss _(aka Triplet Loss)_. 

$$ L = \frac 1N \underset {q, a^+, a^-} \sum max(0, \space \delta - sim[V_q(q), V_a(a^+)] + sim[V_q(q), V_a(a^-)] )$$

, where
* sim[a, b] is some similarity function: dot product, cosine or negative distance
* δ - loss hyperparameter, e.g. δ=1.0. If sim[a, b] is linear in b, all δ > 0 are equivalent.


This reads as __Correct answers must be closer than the wrong ones by at least δ.__

![img](https://raw.githubusercontent.com/yandexdataschool/nlp_course/master/resources/margin.png)
<center>_image: question vector is green, correct answers are blue, incorrect answers are red_</center>


Note: in effect, we train a Deep Semantic Similarity Model [DSSM](https://www.microsoft.com/en-us/research/project/dssm/). 

In [0]:
def sim(a,b):
  
  return torch.dot(a.view(-1),b.view(-1))

def compute_loss(anchors, positives, negatives, delta=1):
    """ 
    Compute the triplet loss:
    
    max(0, delta + sim(anchors, negatives) - sim(anchors, positives))
    
    where sim is a dot-product between vectorized inputs
    
    """
    loss = torch.max( delta + sim(anchors, negatives) - sim(anchors, positives) ,0)
#     loss = delta + sim(anchors, negatives) - sim(anchors, positives)
#     print(sim(anchors, negatives) ,sim(anchors, positives) )

    
    return loss

In [0]:
def compute_recall(anchors, positives, negatives, delta=1):
    """
    Compute the probability (ratio) at which sim(anchors, negatives) is greater than sim(anchors, positives)
    """
    ratio = sim(anchors, negatives)/sim(anchors, positives) 
    return ratio

### Training loop (4 points)

For a difference, we'll ask __you__ to implement training loop this time.

Here's a sketch of one epoch:
1. iterate over __`batches_per_epoch`__ batches from __`train_data`__ with __`iterate_minibatches`__
    * Compute loss, backprop, optimize
    * Compute and accumulate recall
    
2. iterate over __`batches_per_epoch`__ batches from __`val_data`__
    * Compute and accumulate recall
    
3. print stuff :)


In [0]:
num_epochs = 100
max_len = 100
batch_size = 32
batches_per_epoch = 100

In [159]:
from itertools import chain

question_vectorizer = QuestionVectorizer(out_size=max_len)
answer_vectorizer = AnswerVectorizer(out_size=max_len)

optimizer = torch.optim.Adam(chain(question_vectorizer.parameters(),
                             answer_vectorizer.parameters()))

  "num_layers={}".format(dropout, num_layers))


In [160]:
loss

tensor(-346.1432, grad_fn=<SubBackward0>)

In [161]:
for e in range(batches_per_epoch):
  for data in iterate_minibatches(train, batch_size):
    
    _dummy_anchors=question_vectorizer(dummy_batch["questions"])
    _dummy_positives = answer_vectorizer(dummy_batch["correct_answers"])
    _dummy_negatives = answer_vectorizer(dummy_batch["wrong_answers"])

  #   print( compute_recall(_dummy_anchors, _dummy_positives, _dummy_negatives) )
    

    loss = compute_loss(_dummy_anchors, _dummy_positives, _dummy_negatives)
    

    question_vectorizer.zero_grad()
    answer_vectorizer.zero_grad()

    loss.backward()
    optimizer.step()
#     print( loss ) 

  print( loss ) 

tensor(2.2049, grad_fn=<SubBackward0>)
tensor(-5.1287, grad_fn=<SubBackward0>)
tensor(-12.4397, grad_fn=<SubBackward0>)
tensor(-19.9513, grad_fn=<SubBackward0>)
tensor(-27.8393, grad_fn=<SubBackward0>)
tensor(-36.2341, grad_fn=<SubBackward0>)
tensor(-45.2256, grad_fn=<SubBackward0>)
tensor(-54.8554, grad_fn=<SubBackward0>)
tensor(-65.1042, grad_fn=<SubBackward0>)
tensor(-75.8964, grad_fn=<SubBackward0>)
tensor(-87.1177, grad_fn=<SubBackward0>)
tensor(-98.6352, grad_fn=<SubBackward0>)
tensor(-110.3139, grad_fn=<SubBackward0>)
tensor(-122.0317, grad_fn=<SubBackward0>)
tensor(-133.6899, grad_fn=<SubBackward0>)
tensor(-145.2172, grad_fn=<SubBackward0>)
tensor(-156.5695, grad_fn=<SubBackward0>)
tensor(-167.7290, grad_fn=<SubBackward0>)
tensor(-178.7047, grad_fn=<SubBackward0>)
tensor(-189.5339, grad_fn=<SubBackward0>)
tensor(-200.2828, grad_fn=<SubBackward0>)
tensor(-211.0421, grad_fn=<SubBackward0>)
tensor(-221.9093, grad_fn=<SubBackward0>)
tensor(-232.9450, grad_fn=<SubBackward0>)
tensor(

KeyboardInterrupt: ignored

### Evaluation

Let's see how our model performs on actual question answering. You will score answer candidates with your model and select the most appropriate one.

__Your goal__ is to obtain accuracy of at least above 50%. Beating 65% in this notebook yields bonus points :)

In [0]:
# optional: prepare some functions here
# <...>

def select_best_answer(question, possible_answers):
  
    """
    Predicts which answer best fits the question
    :param question: a single string containing a question
    :param possible_answers: a list of strings containing possible answers
    :returns: integer - the index of best answer in possible_answer
    """
  
    q=question_vectorizer(question)
    nans=0
    now = -1e9
    
    for i , a in enumirate(possible_answers):
      ans=answer_vectorizer(a)
      if(sim(q, ans)> now):
        now = sim(q, ans)
        nans=i
        
    return possible_answers[nans]
    

    

In [192]:
for t in test[["question"]].iterrows():
  print(t)
  break

(73089, question    in what year was caesar given the power of a c...
Name: 73089, dtype: object)


In [185]:
predicted_answers = [
    select_best_answer(question, possible_answers)
    for i, (question, possible_answers) in tqdm(test[['question', 'options']].iterrows(), total=len(test))
]

accuracy = np.mean([
    answer in correct_ix
    for answer, correct_ix in zip(predicted_answers, test['correct_indices'].values)
])
print("Accuracy: %0.5f" % accuracy)
assert accuracy > 0.65, "we need more accuracy!"
print("Great job!")


  0%|          | 0/8161 [00:00<?, ?it/s][A
  1%|          | 94/8161 [00:00<00:08, 933.42it/s][A


[[ 196]
 [4857]
 [   0]
 [4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [8197]
 [ 473]
 [  24]
 [  33]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [ 920]
 [  24]
 [ 473]
 [  20]
 [  24]
 [  33]
 [   0]
 [8440]
 [ 196]
 [6069]
 [ 473]
 [4857]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [1860]
 [8289]
 [4135]
 [ 473]
 [  33]
 [   0]
 [8289]
 [1218]
 [   0]
 [  24]
 [   0]
 [ 920]
 [ 473]
 [4857]
 [  20]
 [8289]
 [  33]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [ 196]
 [  20]
 [   0]
 [ 920]
 [8289]
 [4857]
 [  20]
 [ 196]
 [6523]
 [ 473]
 [  33]
 [ 473]
 [6523]
 [   0]
 [ 972]
 [8289]
 [   0]
 [7819]
 [  24]
 [6069]
 [ 473]
 [   0]
 [ 920]
 [8289]
 [ 336]
 [4857]
 [ 972]
 [ 473]
 [  33]
 [  24]
 [ 920]
 [ 972]
 [ 473]
 [6523]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [1860]
 [  33]
 [ 473]
 [  20]
 [ 972]
 [ 196]
 [8440]
 [ 473]
 [   0]
 [8289]
 [1218]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  20]
 [ 473]
 [4857]
 [  24]
 [ 972]
 [8289]
 [  33]
 [ 196]
 [  24]
 [ 689]
 [   0]
 [  24]

  2%|▏         | 189/8161 [00:00<00:08, 935.91it/s][A
  3%|▎         | 257/8161 [00:00<00:09, 840.80it/s][A
  4%|▍         | 352/8161 [00:00<00:08, 869.82it/s][A

[[ 4135]
 [ 7819]
 [  473]
 [ 4857]
 [    0]
 [ 6523]
 [  196]
 [ 6523]
 [    0]
 [ 1545]
 [   24]
 [ 1860]
 [   24]
 [ 4857]
 [    0]
 [   35]
 [  473]
 [  920]
 [ 8289]
 [ 2521]
 [  473]
 [    0]
 [   24]
 [ 4857]
 [    0]
 [   24]
 [10460]
 [  196]
 [   20]
 [    0]
 [ 1860]
 [ 8289]
 [ 4135]
 [  473]
 [   33]
 [    0]
 [    1]]
[[ 7819]
 [ 8289]
 [ 4135]
 [    0]
 [ 2521]
 [   24]
 [ 4857]
 [ 8197]
 [    0]
 [ 4857]
 [   24]
 [  972]
 [  196]
 [ 8289]
 [ 4857]
 [   20]
 [    0]
 [  920]
 [ 8289]
 [ 2521]
 [ 1860]
 [   33]
 [  196]
 [   20]
 [  473]
 [ 6523]
 [    0]
 [  972]
 [ 7819]
 [  473]
 [    0]
 [   24]
 [10460]
 [  196]
 [   20]
 [    0]
 [ 1860]
 [ 8289]
 [ 4135]
 [  473]
 [   33]
 [   20]
 [    0]
 [    1]]
[[7819]
 [8289]
 [4135]
 [   0]
 [2521]
 [ 336]
 [ 920]
 [7819]
 [   0]
 [ 920]
 [8289]
 [8289]
 [  33]
 [6523]
 [ 196]
 [4857]
 [  24]
 [ 972]
 [ 196]
 [8289]
 [4857]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [  35]
 [ 47


  5%|▌         | 421/8161 [00:00<00:09, 803.78it/s][A
  6%|▋         | 512/8161 [00:00<00:09, 832.42it/s][A

[[ 336]
 [  20]
 [ 196]
 [4857]
 [8440]
 [   0]
 [  24]
 [   0]
 [ 972]
 [8289]
 [ 972]
 [  24]
 [ 689]
 [   0]
 [  35]
 [ 689]
 [8289]
 [ 920]
 [5491]
 [  24]
 [6523]
 [ 473]
 [   0]
 [  24]
 [4857]
 [6523]
 [   0]
 [  24]
 [ 196]
 [  33]
 [   0]
 [  33]
 [  24]
 [ 196]
 [6523]
 [  20]
 [   0]
 [   1]
 [   0]
 [4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [ 336]
 [  20]
 [   0]
 [4857]
 [  24]
 [6069]
 [8197]
 [   0]
 [7819]
 [8289]
 [1860]
 [ 473]
 [   0]
 [ 972]
 [8289]
 [   0]
 [1218]
 [8289]
 [  33]
 [ 920]
 [ 473]
 [   0]
 [   1]]
[[7819]
 [8289]
 [4135]
 [   0]
 [2521]
 [  24]
 [4857]
 [8197]
 [   0]
 [1545]
 [  24]
 [1860]
 [  24]
 [4857]
 [ 473]
 [  20]
 [ 473]
 [   0]
 [ 972]
 [  33]
 [8289]
 [8289]
 [1860]
 [  20]
 [   0]
 [6523]
 [ 473]
 [1218]
 [ 473]
 [4857]
 [6523]
 [ 473]
 [6523]
 [   0]
 [8289]
 [5491]
 [ 196]
 [4857]
 [  24]
 [4135]
 [  24]
 [   0]
 [   1]]
[[7819]
 [8289]
 [4135]
 [   0]
 [6523]
 [ 196]
 [6523


  7%|▋         | 586/8161 [00:00<00:09, 787.17it/s][A
  8%|▊         | 680/8161 [00:00<00:09, 825.37it/s][A


[[ 196]
 [4857]
 [   0]
 [4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [8197]
 [ 473]
 [  24]
 [  33]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [ 336]
 [7819]
 [1218]
 [   0]
 [ 920]
 [7819]
 [  24]
 [4857]
 [4857]
 [ 473]
 [ 689]
 [  20]
 [   0]
 [  35]
 [ 473]
 [8440]
 [ 196]
 [4857]
 [8440]
 [   0]
 [  35]
 [ 473]
 [ 196]
 [4857]
 [8440]
 [   0]
 [ 689]
 [ 196]
 [ 920]
 [ 473]
 [4857]
 [  20]
 [ 473]
 [6523]
 [   0]
 [  35]
 [8197]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [1218]
 [ 920]
 [ 920]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [ 196]
 [4857]
 [6523]
 [ 473]
 [1860]
 [ 473]
 [4857]
 [6523]
 [ 473]
 [4857]
 [ 972]
 [   0]
 [  20]
 [ 972]
 [  24]
 [ 972]
 [ 196]
 [8289]
 [4857]
 [   0]
 [ 196]
 [  20]
 [   0]
 [ 196]
 [4857]
 [   0]
 [  20]
 [  24]
 [4857]
 [   0]
 [6523]
 [ 196]
 [ 473]
 [8440]
 [8289]
 [   0]
 [   1]]
[[ 7819]
 [ 8289]
 [ 4135]
 [    0]
 [ 2521]
 [   24]
 [ 4857]
 [ 8197]
 [    0]
 [ 6523]
 [  336]
 [ 8289]
 [ 1860]
 [ 8289]
 [  689]
 [  196]
 [  4


  9%|▉         | 762/8161 [00:00<00:08, 823.56it/s][A
 10%|█         | 842/8161 [00:01<00:09, 781.36it/s][A


[[6523]
 [ 336]
 [ 473]
 [   0]
 [ 972]
 [8289]
 [   0]
 [ 196]
 [4857]
 [ 920]
 [  33]
 [ 473]
 [  24]
 [  20]
 [ 473]
 [6523]
 [   0]
 [ 336]
 [4857]
 [ 473]
 [2521]
 [1860]
 [ 689]
 [8289]
 [8197]
 [2521]
 [ 473]
 [4857]
 [ 972]
 [   0]
 [   1]
 [   0]
 [4135]
 [7819]
 [8289]
 [   0]
 [2521]
 [  24]
 [ 196]
 [4857]
 [ 689]
 [8197]
 [   0]
 [8289]
 [1860]
 [1860]
 [8289]
 [  20]
 [ 473]
 [6523]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  20]
 [7819]
 [  24]
 [7819]
 [   0]
 [  19]
 [   0]
 [  20]
 [   0]
 [  33]
 [ 473]
 [8440]
 [ 196]
 [2521]
 [ 473]
 [   0]
 [   1]]
[[4135]
 [7819]
 [8289]
 [   0]
 [ 196]
 [4857]
 [6069]
 [  24]
 [6523]
 [ 473]
 [6523]
 [   0]
 [ 196]
 [  33]
 [  24]
 [4857]
 [   0]
 [ 196]
 [4857]
 [   0]
 [ 711]
 [2764]
 [2088]
 [2762]
 [   0]
 [   1]]
[[4135]
 [7819]
 [ 473]
 [4857]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [ 196]
 [  33]
 [  24]
 [4857]
 [ 196]
 [  24]
 [4857]
 [   0]
 [  24]
 [  33]
 [2521]
 [8197]
 [   0]
 [186


 11%|█▏        | 930/8161 [00:01<00:08, 807.86it/s][A
 12%|█▏        | 1011/8161 [00:01<00:09, 777.29it/s][A


[[4135]
 [7819]
 [ 473]
 [4857]
 [   0]
 [  20]
 [ 473]
 [  24]
 [   0]
 [ 689]
 [ 473]
 [6069]
 [ 473]
 [ 689]
 [  20]
 [   0]
 [  33]
 [8289]
 [  20]
 [ 473]
 [   0]
 [4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [8289]
 [ 920]
 [ 920]
 [ 336]
 [  33]
 [ 473]
 [6523]
 [   0]
 [ 196]
 [4857]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  35]
 [  33]
 [ 196]
 [ 972]
 [ 196]
 [  20]
 [7819]
 [   0]
 [ 196]
 [  20]
 [ 689]
 [ 473]
 [  20]
 [   0]
 [  24]
 [  33]
 [ 473]
 [  24]
 [   0]
 [   1]]
[[4135]
 [7819]
 [ 473]
 [4857]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  35]
 [  33]
 [ 196]
 [ 972]
 [ 196]
 [  20]
 [7819]
 [   0]
 [ 196]
 [  20]
 [ 689]
 [ 473]
 [  20]
 [   0]
 [  24]
 [  33]
 [ 473]
 [  24]
 [   0]
 [  35]
 [ 473]
 [ 920]
 [8289]
 [2521]
 [ 473]
 [   0]
 [  20]
 [ 473]
 [1860]
 [  24]
 [  33]
 [  24]
 [ 972]
 [ 473]
 [6523]
 [   0]
 [1218]
 [  33]
 [8289]
 [2521]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [ 473]
 [ 336]
 [  33]
 [8289]
 [1860]
 [ 473


 14%|█▎        | 1111/8161 [00:01<00:08, 831.08it/s][A
 15%|█▍        | 1195/8161 [00:01<00:08, 792.86it/s][A
 16%|█▌        | 1283/8161 [00:01<00:08, 815.51it/s][A

[[  24]
 [2521]
 [ 473]
 [  33]
 [ 196]
 [ 920]
 [  24]
 [4857]
 [   0]
 [  35]
 [ 336]
 [ 196]
 [ 689]
 [6523]
 [ 196]
 [4857]
 [8440]
 [  20]
 [   0]
 [ 196]
 [4857]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [8440]
 [ 473]
 [8289]
 [  33]
 [8440]
 [ 196]
 [  24]
 [4857]
 [   0]
 [1860]
 [ 473]
 [  33]
 [ 196]
 [8289]
 [6523]
 [   0]
 [4135]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [1218]
 [  33]
 [ 473]
 [2572]
 [ 336]
 [ 473]
 [4857]
 [ 972]
 [ 689]
 [8197]
 [   0]
 [ 920]
 [8289]
 [4857]
 [  20]
 [ 972]
 [  33]
 [ 336]
 [ 920]
 [ 972]
 [ 473]
 [6523]
 [   0]
 [8289]
 [1218]
 [   0]
 [4135]
 [7819]
 [ 196]
 [ 920]
 [7819]
 [   0]
 [2521]
 [  24]
 [ 972]
 [ 473]
 [  33]
 [ 196]
 [  24]
 [ 689]
 [  20]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [6523]
 [ 196]
 [1218]
 [1218]
 [ 196]
 [ 920]
 [ 336]
 [ 689]
 [ 972]
 [   0]
 [ 972]
 [8289]
 [   0]
 [8289]
 [  35]
 [ 972]
 [  24]
 [ 196]
 [4857]
 [   0]
 [  24]
 [4857]
 [6523]
 [   0]
 [ 972]
 [  33]


 17%|█▋        | 1366/8161 [00:01<00:08, 817.43it/s][A
 18%|█▊        | 1449/8161 [00:01<00:08, 805.52it/s][A


[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [ 920]
 [  24]
 [4857]
 [   0]
 [  35]
 [ 473]
 [   0]
 [  20]
 [  24]
 [ 196]
 [6523]
 [   0]
 [  24]
 [  35]
 [8289]
 [ 336]
 [ 972]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  24]
 [ 920]
 [ 920]
 [ 336]
 [  33]
 [  24]
 [ 920]
 [8197]
 [   0]
 [8289]
 [1218]
 [   0]
 [1860]
 [  33]
 [ 473]
 [  20]
 [ 473]
 [4857]
 [ 972]
 [  24]
 [ 972]
 [ 196]
 [8289]
 [4857]
 [  24]
 [ 689]
 [   0]
 [ 196]
 [2521]
 [2521]
 [ 473]
 [6523]
 [ 196]
 [  24]
 [ 920]
 [8197]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [ 196]
 [  20]
 [   0]
 [4135]
 [7819]
 [ 196]
 [ 972]
 [ 473]
 [7819]
 [ 473]
 [  24]
 [6523]
 [   0]
 [  19]
 [   0]
 [  20]
 [   0]
 [ 972]
 [ 473]
 [  33]
 [2521]
 [   0]
 [1218]
 [8289]
 [  33]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [ 972]
 [4135]
 [8289]
 [   0]
 [2521]
 [8289]
 [6523]
 [ 473]
 [  20]
 [   0]
 [8289]
 [1218]
 [   0]
 [1860]
 [ 473]
 [  33]
 [ 920]
 [ 473]
 [1860]
 [ 972]
 [ 196]
 [8289]
 [4857]
 [  20]
 [   0]

 19%|█▉        | 1543/8161 [00:01<00:07, 840.29it/s][A
 20%|█▉        | 1628/8161 [00:01<00:08, 780.61it/s][A

[[4135]
 [7819]
 [ 473]
 [4857]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [1218]
 [ 196]
 [  33]
 [  20]
 [ 972]
 [   0]
 [4135]
 [ 196]
 [4857]
 [6523]
 [8289]
 [4135]
 [  20]
 [   0]
 [2088]
 [   0]
 [1860]
 [  24]
 [ 972]
 [ 920]
 [7819]
 [   0]
 [  20]
 [ 473]
 [4857]
 [ 972]
 [   0]
 [8289]
 [ 336]
 [ 972]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [ 196]
 [  20]
 [   0]
 [  20]
 [5491]
 [8197]
 [6523]
 [  33]
 [ 196]
 [6069]
 [ 473]
 [   0]
 [   1]]
[[ 4135]
 [ 7819]
 [   24]
 [  972]
 [    0]
 [ 1218]
 [  473]
 [   24]
 [  972]
 [  336]
 [   33]
 [  473]
 [   20]
 [    0]
 [ 6523]
 [ 8289]
 [  473]
 [   20]
 [    0]
 [10460]
 [   35]
 [ 8289]
 [10460]
 [    0]
 [  689]
 [  196]
 [ 6069]
 [  473]
 [    0]
 [ 7819]
 [  336]
 [   35]
 [    0]
 [ 8289]
 [ 1218]
 [ 1218]
 [  473]
 [   33]
 [    0]
 [    1]]
[[4135]
 [7819]
 [ 473]
 [4857]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [1218]
 [  24]
 [ 920]
 [ 473]
 [  35]
 [8289]
 [8289]
 [549


 21%|██        | 1726/8161 [00:02<00:07, 830.39it/s][A
 22%|██▏       | 1811/8161 [00:02<00:07, 806.89it/s][A
 23%|██▎       | 1897/8161 [00:02<00:07, 821.03it/s][A

[[4135]
 [7819]
 [8289]
 [   0]
 [ 196]
 [  20]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  24]
 [4857]
 [8440]
 [ 689]
 [ 196]
 [ 920]
 [  24]
 [4857]
 [   0]
 [  35]
 [ 196]
 [  20]
 [7819]
 [8289]
 [1860]
 [   0]
 [8289]
 [1218]
 [   0]
 [  20]
 [4135]
 [  24]
 [ 486]
 [ 196]
 [ 689]
 [  24]
 [4857]
 [6523]
 [   0]
 [   1]]
[[7819]
 [8289]
 [4135]
 [   0]
 [2521]
 [  24]
 [4857]
 [8197]
 [   0]
 [1545]
 [ 473]
 [4135]
 [ 196]
 [  20]
 [7819]
 [   0]
 [1218]
 [  24]
 [2521]
 [ 196]
 [ 689]
 [ 196]
 [ 473]
 [  20]
 [   0]
 [  24]
 [  33]
 [ 473]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [ 196]
 [4857]
 [   0]
 [  20]
 [4135]
 [  24]
 [ 486]
 [ 196]
 [ 689]
 [  24]
 [4857]
 [6523]
 [   0]
 [   1]]
[[7819]
 [8289]
 [4135]
 [   0]
 [2521]
 [  24]
 [4857]
 [8197]
 [   0]
 [8197]
 [ 473]
 [  24]
 [  33]
 [  20]
 [   0]
 [6523]
 [8289]
 [ 473]
 [  20]
 [   0]
 [  24]
 [   0]
 [  20]
 [ 972]
 [ 336]
 [6523]
 [ 473]
 [4857]
 [ 972]
 [   0]
 [  20]
 [1860]
 [ 473]
 [4857]
 [6523


 24%|██▍       | 1985/8161 [00:02<00:07, 837.74it/s][A
 25%|██▌       | 2070/8161 [00:02<00:07, 799.50it/s][A


[[4135]
 [7819]
 [ 196]
 [ 920]
 [7819]
 [   0]
 [6523]
 [ 473]
 [6069]
 [ 473]
 [ 689]
 [8289]
 [1860]
 [ 473]
 [  33]
 [   0]
 [  35]
 [ 473]
 [8440]
 [  24]
 [4857]
 [   0]
 [  33]
 [ 473]
 [ 689]
 [ 473]
 [  24]
 [  20]
 [ 196]
 [4857]
 [8440]
 [   0]
 [8440]
 [  24]
 [2521]
 [ 473]
 [  20]
 [   0]
 [1218]
 [8289]
 [  33]
 [   0]
 [  35]
 [8289]
 [ 972]
 [7819]
 [   0]
 [4857]
 [ 473]
 [  20]
 [   0]
 [  24]
 [4857]
 [6523]
 [   0]
 [  20]
 [ 473]
 [8440]
 [  24]
 [   0]
 [ 196]
 [4857]
 [   0]
 [ 711]
 [2764]
 [2764]
 [ 711]
 [   0]
 [   1]]
[[7819]
 [8289]
 [4135]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [4857]
 [ 196]
 [4857]
 [ 972]
 [ 473]
 [4857]
 [6523]
 [8289]
 [   0]
 [  33]
 [  24]
 [4857]
 [5491]
 [   0]
 [  20]
 [ 336]
 [  35]
 [2521]
 [ 196]
 [ 972]
 [ 972]
 [ 473]
 [6523]
 [   0]
 [8440]
 [  24]
 [2521]
 [ 473]
 [  20]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [8440]
 [  24]
 [2521]
 [ 473]
 [   0]
 [1218]
 [ 196]
 [  33]
 [  20]
 [ 972]
 [   0]
 [ 920

 26%|██▋       | 2155/8161 [00:02<00:07, 813.51it/s][A
 27%|██▋       | 2238/8161 [00:02<00:07, 792.54it/s][A
 29%|██▊       | 2338/8161 [00:02<00:06, 844.62it/s][A

[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [4857]
 [  24]
 [2521]
 [ 473]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [6523]
 [ 473]
 [   0]
 [1860]
 [ 473]
 [8197]
 [  20]
 [ 972]
 [ 473]
 [  33]
 [   0]
 [8440]
 [ 196]
 [6069]
 [ 473]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [ 972]
 [ 336]
 [6069]
 [  24]
 [ 689]
 [ 336]
 [   0]
 [ 196]
 [  20]
 [ 689]
 [  24]
 [4857]
 [6523]
 [  20]
 [   0]
 [   1]]
[[1218]
 [8289]
 [  33]
 [   0]
 [4135]
 [7819]
 [8289]
 [2521]
 [   0]
 [6523]
 [ 196]
 [6523]
 [   0]
 [6523]
 [ 473]
 [   0]
 [1860]
 [ 473]
 [8197]
 [  20]
 [ 972]
 [ 473]
 [  33]
 [   0]
 [4857]
 [  24]
 [2521]
 [ 473]
 [   0]
 [ 473]
 [ 689]
 [ 689]
 [ 196]
 [ 920]
 [ 473]
 [   0]
 [  19]
 [   0]
 [  20]
 [   0]
 [ 196]
 [  20]
 [ 689]
 [  24]
 [4857]
 [6523]
 [   0]
 [   1]]
[[  24]
 [1218]
 [ 972]
 [ 473]
 [  33]
 [   0]
 [1218]
 [ 196]
 [4857]
 [6523]
 [ 689]
 [  24]
 [8197]
 [   0]
 [  19]
 [   0]
 [  20]
 [   0]
 [ 920]
 [7819]
 [  24]
 [  33]
 [ 972]
 [ 196]
 [4857]
 [8440]
 [   0


 30%|██▉       | 2424/8161 [00:02<00:07, 796.14it/s][A
 31%|███       | 2520/8161 [00:03<00:06, 837.81it/s][A

[[4135]
 [7819]
 [8289]
 [   0]
 [4135]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [1860]
 [  24]
 [  33]
 [ 473]
 [4857]
 [ 972]
 [  20]
 [   0]
 [8289]
 [1218]
 [   0]
 [2521]
 [  24]
 [  33]
 [8197]
 [   0]
 [  35]
 [ 473]
 [ 689]
 [ 196]
 [ 473]
 [6069]
 [ 473]
 [6523]
 [   0]
 [ 972]
 [8289]
 [   0]
 [  35]
 [ 473]
 [   0]
 [   1]]
[[4135]
 [7819]
 [ 473]
 [4857]
 [   0]
 [4135]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [ 920]
 [8289]
 [4857]
 [ 972]
 [  33]
 [  24]
 [6523]
 [ 196]
 [ 920]
 [ 972]
 [8289]
 [  33]
 [8197]
 [   0]
 [6069]
 [ 196]
 [ 473]
 [4135]
 [  20]
 [   0]
 [ 972]
 [8289]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [  35]
 [ 473]
 [ 689]
 [ 196]
 [ 473]
 [1218]
 [   0]
 [8289]
 [1218]
 [   0]
 [2521]
 [  24]
 [  33]
 [8197]
 [   0]
 [  19]
 [   0]
 [  20]
 [   0]
 [1860]
 [  24]
 [  33]
 [ 473]
 [4857]
 [ 972]
 [  24]
 [8440]
 [ 473]
 [   0]
 [  20]
 [ 972]
 [  33]
 [ 336]
 [ 920]
 [5491]
 [   0]
 [6523]
 [8289]
 [4135]
 [4857]
 [   0]
 [   1]]
[[4135


 32%|███▏      | 2625/8161 [00:03<00:06, 891.83it/s][A
 33%|███▎      | 2717/8161 [00:03<00:06, 848.18it/s][A


[[4135]
 [7819]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [ 196]
 [  20]
 [   0]
 [  20]
 [ 336]
 [  33]
 [ 486]
 [7819]
 [8197]
 [5491]
 [   0]
 [ 336]
 [  20]
 [ 473]
 [6523]
 [   0]
 [   1]]
[[4135]
 [7819]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [ 196]
 [  20]
 [   0]
 [ 972]
 [  33]
 [  24]
 [  20]
 [ 196]
 [  24]
 [4857]
 [5491]
 [  24]
 [   0]
 [ 336]
 [  20]
 [ 473]
 [6523]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [  19]
 [   0]
 [ 689]
 [ 196]
 [ 972]
 [ 972]
 [ 689]
 [ 473]
 [   0]
 [  33]
 [ 336]
 [  20]
 [  20]
 [ 196]
 [  24]
 [4857]
 [   0]
 [  19]
 [   1]]
[[ 4135]
 [ 7819]
 [   24]
 [  972]
 [    0]
 [  689]
 [   24]
 [ 4857]
 [ 8440]
 [  336]
 [   24]
 [ 8440]
 [  473]
 [    0]
 [ 6523]
 [ 8289]
 [  473]
 [   20]
 [    0]
 [  972]
 [ 7819]
 [  473]
 [    0]
 [   33]
 [  336]
 [   20]
 [   20]
 [  196]
 [   24]
 [ 4857]
 [    0]
 [ 8289]
 [   33]
 [  972]
 [ 7819]
 [ 8289]
 [ 6523]
 [ 8289]
 [10460]
 [    0]
 [  920]
 [ 7819]
 [  336]
 


 34%|███▍      | 2804/8161 [00:03<00:06, 825.96it/s][A
 35%|███▌      | 2889/8161 [00:03<00:06, 799.37it/s][A


[[ 4135]
 [ 7819]
 [   24]
 [  972]
 [    0]
 [  972]
 [ 8197]
 [ 1860]
 [  473]
 [    0]
 [ 8289]
 [ 1218]
 [    0]
 [   24]
 [  196]
 [   33]
 [  920]
 [   33]
 [   24]
 [ 1218]
 [  972]
 [    0]
 [  196]
 [   20]
 [    0]
 [ 4857]
 [ 8289]
 [  972]
 [    0]
 [   24]
 [ 1860]
 [ 1860]
 [   33]
 [ 8289]
 [ 1860]
 [   33]
 [  196]
 [   24]
 [  972]
 [  473]
 [    0]
 [  972]
 [ 8289]
 [    0]
 [   35]
 [  473]
 [    0]
 [  336]
 [   20]
 [  473]
 [ 6523]
 [    0]
 [  196]
 [ 4857]
 [    0]
 [  920]
 [  689]
 [ 8289]
 [   20]
 [  473]
 [    0]
 [ 1860]
 [   33]
 [ 8289]
 [10460]
 [  196]
 [ 2521]
 [  196]
 [  972]
 [ 8197]
 [    0]
 [ 8289]
 [ 1218]
 [    0]
 [  336]
 [   20]
 [    0]
 [ 8440]
 [   33]
 [ 8289]
 [  336]
 [ 4857]
 [ 6523]
 [    0]
 [  972]
 [   33]
 [ 8289]
 [ 8289]
 [ 1860]
 [   20]
 [    0]
 [    1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [  24]
 [  33]
 [ 473]
 [   0]
 [2521]
 [8289]
 [  20]
 [ 972]
 [   0]
 [  24]
 [ 972]
 [ 972]
 [  24]
 [ 920]
 [5491]
 [   0]
 [


 37%|███▋      | 2986/8161 [00:03<00:06, 842.73it/s][A
 38%|███▊      | 3072/8161 [00:03<00:06, 799.80it/s][A

[[4135]
 [7819]
 [ 473]
 [  33]
 [ 473]
 [   0]
 [7819]
 [  24]
 [6069]
 [ 473]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [1860]
 [8289]
 [4135]
 [ 473]
 [  33]
 [  20]
 [   0]
 [2521]
 [  24]
 [ 196]
 [4857]
 [ 972]
 [  24]
 [ 196]
 [4857]
 [ 473]
 [6523]
 [   0]
 [1860]
 [ 473]
 [  24]
 [ 920]
 [ 473]
 [   0]
 [ 196]
 [4857]
 [   0]
 [  33]
 [ 473]
 [ 920]
 [ 473]
 [4857]
 [ 972]
 [   0]
 [8197]
 [ 473]
 [  24]
 [  33]
 [  20]
 [   0]
 [   1]]
[[4135]
 [7819]
 [  24]
 [ 972]
 [   0]
 [4135]
 [  24]
 [  20]
 [   0]
 [ 972]
 [7819]
 [ 473]
 [   0]
 [ 689]
 [  24]
 [  20]
 [ 972]
 [   0]
 [ 920]
 [  33]
 [ 196]
 [  20]
 [ 196]
 [  20]
 [   0]
 [6523]
 [ 336]
 [  33]
 [ 196]
 [4857]
 [8440]
 [   0]
 [ 920]
 [8289]
 [ 689]
 [6523]
 [   0]
 [4135]
 [  24]
 [  33]
 [   0]
 [ 972]
 [8289]
 [   0]
 [ 473]
 [  20]
 [ 920]
 [  24]
 [ 689]
 [  24]
 [ 972]
 [ 473]
 [   0]
 [4135]
 [8289]
 [  33]
 [ 689]
 [6523]
 [   0]
 [1860]
 [8289]
 [4135]
 [ 473]
 [  33]
 [  20]
 [   0]
 [   1]]
[[4135]
 [7819

KeyboardInterrupt: ignored

In [0]:
def draw_results(question, possible_answers, predicted_index, correct_indices):
    print("Q:", question, end='\n\n')
    for i, answer in enumerate(possible_answers):
        print("#%i: %s %s" % (i, '[*]' if i == predicted_index else '[ ]', answer))
    
    print("\nVerdict:", "CORRECT" if predicted_index in correct_indices else "INCORRECT", 
          "(ref: %s)" % correct_indices, end='\n' * 3)

In [0]:
for i in [1, 100, 1000, 2000, 3000, 4000, 5000]:
    draw_results(test.iloc[i].question, test.iloc[i].options,
                 predicted_answers[i], test.iloc[i].correct_indices)

In [0]:
question = "What is my name?" # your question here!
possible_answers = [
    <...> 
    # ^- your options. 
]
predicted answer = select_best_answer(question, possible_answers)

draw_results(question, possible_answers,
             predicted_answer, [0])

### Bonus tasks

There are many ways to improve our question answering model. Here's a bunch of things you can do to increase your understanding and get bonus points.


### 0. Fine-tuning (3+ pts)
This time our dataset is fairly small. We can improve the training procedure by starting with a pre-trained model.
* The simplest option is to use pre-trained embeddings. See previous weeks for that.
* A harder (but better) alternative is to use a pre-trained sentence encoder. Consider [InferSent](https://github.com/facebookresearch/InferSent), Universal Sentence Encoder or ELMO.


### 1.  Hard Negatives (3+ pts)

Not all wrong answers are equally wrong. As the training progresses, _most negative examples $a^-$ will be to easy._ So easy in fact, that loss function and gradients on such negatives is exactly __0.0__. To improve training efficiency, one can __mine hard negative samples__.

Given a list of answers,
* __Hard negative__ is the wrong answer with highest similarity with question,

$$a^-_{hard} = \underset {a^-} {argmax} \space sim[V_q(q), V_a(a^-)]$$

* __Semi-hard negative__ is the one with highest similarity _among wrong answers that are farther than positive one. This option is more useful if some wrong answers may actually be mislabelled correct answers.

* One can also __sample__ negatives proportionally to $$P(a^-_i) \sim e ^ {sim[V_q(q), V_a(a^-_i)]}$$


The task is to implement at least __hard negative__ sampling and apply it for model training.


### 2. Bring Your Own Model (3+ pts)
In addition to Universal Sentence Encoder, one can also train a new model.
* You name it: convolutions, RNN, self-attention
* Use pre-trained ELMO or FastText embeddings
* Monitor overfitting and use dropout / word dropout to improve performance

__Note:__ if you use ELMO please note that it requires tokenized text while USE can deal with raw strings. You can tokenize data manually or use tokenized=True when reading dataset.


* hard negatives (strategies: hardest, hardest farter than current, randomized)
* train model on the full dataset to see if it can mine answers to new questions over the entire wikipedia. Use approximate nearest neighbor search for fast lookup.


### 3. Search engine (3+ pts)

Our basic model only selects answers from 2-5 available sentences in paragraph. You can extend it to search over __the whole dataset__. All sentences in all other paragraphs are viable answers.

The goal is to train such a model and use it to __quickly find top-10 answers from the whole set__.

* You can ask such model a question of your own making - to see which answers it can find among the entire training dataset or even the entire wikipedia.
* Searching for top-K neighbors is easier if you use specialized methods: [KD-Tree](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KDTree.html) or [HNSW](https://github.com/nmslib/hnswlib). 
* This task is much easier to train if you use hard or semi-hard negatives. You can even find hard negatives for one question from correct answers to other questions in batch - do so in-graph for maximum efficiency. See [1.] for more details.
