# Question-Answer Dataset
This page provides a link to a corpus of Wikipedia articles, manually-generated factoid questions from them, and manually-generated answers to these questions, for use in academic research. These data were collected by Noah Smith, Michael Heilman, Rebecca Hwa, Shay Cohen, Kevin Gimpel, and many students at Carnegie Mellon University and the University of Pittsburgh between 2008 and 2010.

## Download
Manually-generated factoid question/answer pairs with difficulty ratings from Wikipedia articles. Dataset includes articles, questions, and answers.
Version 1.2 released August 23, 2013 (same data as 1.1, but now released under GFDL and CC BY-SA 3.0)
README.v1.2; Question_Answer_Dataset_v1.2.tar.gz (http://www.cs.cmu.edu/~ark/QA-data/data/Question_Answer_Dataset_v1.2.tar.gz)

In [1]:
%matplotlib inline
from __future__ import unicode_literals, print_function, division
from io import open
import unicodedata
import string
import re
import random

import torch
import torch.nn as nn
from torch import optim
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")



In [2]:

!wget http://www.cs.cmu.edu/~ark/QA-data/data/Question_Answer_Dataset_v1.2.tar.gz



--2021-12-18 16:43:10--  http://www.cs.cmu.edu/~ark/QA-data/data/Question_Answer_Dataset_v1.2.tar.gz
Resolving www.cs.cmu.edu (www.cs.cmu.edu)... 128.2.42.95
Connecting to www.cs.cmu.edu (www.cs.cmu.edu)|128.2.42.95|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8254496 (7.9M) [application/x-gzip]
Saving to: ‘Question_Answer_Dataset_v1.2.tar.gz’


2021-12-18 16:43:13 (3.16 MB/s) - ‘Question_Answer_Dataset_v1.2.tar.gz’ saved [8254496/8254496]



In [3]:
!tar -zxvf Question_Answer_Dataset_v1.2.tar.gz

Question_Answer_Dataset_v1.2/
Question_Answer_Dataset_v1.2/S08/
Question_Answer_Dataset_v1.2/S08/question_answer_pairs.txt
Question_Answer_Dataset_v1.2/S08/data/
Question_Answer_Dataset_v1.2/S08/data/set4/
Question_Answer_Dataset_v1.2/S08/data/set4/a6.txt.clean
Question_Answer_Dataset_v1.2/S08/data/set4/a3.txt.clean
Question_Answer_Dataset_v1.2/S08/data/set4/a3.txt
Question_Answer_Dataset_v1.2/S08/data/set4/a5.txt
Question_Answer_Dataset_v1.2/S08/data/set4/a4o.htm
Question_Answer_Dataset_v1.2/S08/data/set4/a3.htm
Question_Answer_Dataset_v1.2/S08/data/set4/a9.htm
Question_Answer_Dataset_v1.2/S08/data/set4/a2.txt
Question_Answer_Dataset_v1.2/S08/data/set4/a9.txt.clean
Question_Answer_Dataset_v1.2/S08/data/set4/a4.htm
Question_Answer_Dataset_v1.2/S08/data/set4/a4.txt
Question_Answer_Dataset_v1.2/S08/data/set4/a4.txt.clean
Question_Answer_Dataset_v1.2/S08/data/set4/a2.htm
Question_Answer_Dataset_v1.2/S08/data/set4/a7o.htm
Question_Answer_Dataset_v1.2/S08/data/set4/a6.txt
Question_Answer_Da

In [4]:
input_file1 = '/content/Question_Answer_Dataset_v1.2/S10/question_answer_pairs.txt'
lines = open(input_file1, encoding='ISO-8859-1').read().strip().split('\n')
for line in lines[1:21]:
  #print(line)
  print(line.split('\t')[1:3])

['Was Alessandro Volta a professor of chemistry?', 'Alessandro Volta was not a professor of chemistry.']
['Was Alessandro Volta a professor of chemistry?', 'No']
['Did Alessandro Volta invent the remotely operated pistol?', 'Alessandro Volta did invent the remotely operated pistol.']
['Did Alessandro Volta invent the remotely operated pistol?', 'Yes']
['Was Alessandro Volta taught in public schools?', 'Volta was taught in public schools.']
['Was Alessandro Volta taught in public schools?', 'Yes']
['Who did Alessandro Volta marry?', 'Alessandro Volta married Teresa Peregrini.']
['Who did Alessandro Volta marry?', 'Teresa Peregrini']
['What did Alessandro Volta invent in 1800?', 'In 1800, Alessandro Volta invented the voltaic pile.']
['What did Alessandro Volta invent in 1800?', 'voltaic pile']
['What is the battery made by Alessandro Volta credited as?', 'The battery made by Volta is credited as the first electrochemical cell.']
['What is the battery made by Alessandro Volta credited as

### Input File Data format:
ArticleTitle\tQuestion\tAnswer\tDifficultyFromQuestioner\tDifficultyFromAnswerer\tArticleFile

ArticleTitle - index=0
Question - index=1
Answer - index=2

Since this is a question-answering model, we are interested only in index=1 (Question) and index=2 (Answer).
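As a minimal sketch of how one row maps onto these indices, the snippet below splits a single tab-separated line. The question and answer are taken from the sample output above; the article title, difficulty values, and file path in the row are made up for illustration and are not copied from the dataset.

```python
# Header as described in the data format section
header = ("ArticleTitle\tQuestion\tAnswer\t"
          "DifficultyFromQuestioner\tDifficultyFromAnswerer\tArticleFile")

# Hypothetical row: only the question and answer text come from the notebook output
row = ("Alessandro_Volta\tWho did Alessandro Volta marry?\tTeresa Peregrini\t"
       "medium\tmedium\tdata/set4/a10")

columns = header.split("\t")
fields = dict(zip(columns, row.split("\t")))

# The same slice the notebook uses: index 1 is the question, index 2 the answer
question, answer = row.split("\t")[1:3]

print(columns.index("Question"), columns.index("Answer"))
print(question, "->", answer)
```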



In [5]:

class Lang:
  def __init__(self, name):
    self.name = name
    self.word2index = {}
    self.word2count = {}
    self.index2word = {0: "SOS", 1 : "EOS"}
    self.n_words = 2

  def addSentence(self, sentence):
    for word in sentence.split(' '):
      self.addWord(word)

  def addWord(self, word):
    if word not in self.word2index:
      self.word2index[word] = self.n_words
      self.index2word[self.n_words] = word
      self.word2count[word] = 1
      self.n_words += 1
    else:
      self.word2count[word] += 1

# Turn a Unicode string to Plain ASCII. 
def unicodeToAscii(s):
  return ''.join(
      c for c in unicodedata.normalize('NFD', s)
      if unicodedata.category(c) != 'Mn'
  )

#lowercase, trim, and remove non-letter characters
def normalizeString(s):
  s = unicodeToAscii(s.lower().strip())
  s = re.sub(r"([.!?])", r" \1",s)
  s = re.sub(r"[^a-zA-Z.!?]+", r" ",s)       
  return s

def read_input_data(src, dest, root_dir, sub_dir, file_name, reverse=False):
  input_data_pairs = []

  for dir in sub_dir:
    path= '%s/%s/%s' %(root_dir,dir,file_name)
    print(path)

    # utf-8 format was causing reading error, so using ISO-8859-1 format.
    lines = open(path, encoding='ISO-8859-1').read().strip().split('\n')
    
    # Split every line into pairs and normalize
    # Column1: Question
    # Column2: Answer
    # Skip header from reading or 1st line.
    pairs = [[normalizeString(s) for s in line.split('\t')[1:3] ] for line in lines[1:]]
    input_data_pairs.extend(pairs)

  input_data = Lang(src)
  output_data = Lang(dest)

  return input_data, output_data, input_data_pairs

MAX_LENGTH = 10


def filterPair(p):
  print(p)
  return len(p[0].split(' ')) < MAX_LENGTH and \
    len(p[1].split(' ')) < MAX_LENGTH

def filterPairs(pairs):
  return [pair for pair in pairs if filterPair(pair)]

def prepare_data(src, dest, root_dir, sub_dir, file_name, reverse=False):
  
  input_data, output_data, pairs = read_input_data(src, dest, root_dir, sub_dir, file_name, True)

  print("Read %s sentence pairs" %len(pairs))
  pairs = filterPairs(pairs)
  
  print("Trimmed to %s sentence pairs" % len(pairs))
  
  print("Counting words...")
  for pair in pairs:
    input_data.addSentence(pair[0])
    output_data.addSentence(pair[1])
  print("counted words:")
  print(input_data.name, input_data.n_words)
  print(output_data.name, output_data.n_words)
  return input_data, output_data, pairs


In [6]:
input_file1 = '/content/Question_Answer_Dataset_v1.2/S08/question_answer_pairs.txt'
root_dir = '/content/Question_Answer_Dataset_v1.2'
sub_dir = ['S08', 'S09', 'S10']
file_name = 'question_answer_pairs.txt'
src = 'Question'
dest = 'Answer'

EOS_Token = 1
SOS_Token = 0


input_data, output_data, pairs = prepare_data(src, dest, root_dir, sub_dir, file_name, False)
print(random.choice(pairs))


/content/Question_Answer_Dataset_v1.2/S08/question_answer_pairs.txt
/content/Question_Answer_Dataset_v1.2/S09/question_answer_pairs.txt
/content/Question_Answer_Dataset_v1.2/S10/question_answer_pairs.txt
Read 3998 sentence pairs
['was abraham lincoln the sixteenth president of the united states ?', 'yes']
['was abraham lincoln the sixteenth president of the united states ?', 'yes .']
['did lincoln sign the national banking act of ?', 'yes']
['did lincoln sign the national banking act of ?', 'yes .']
['did his mother die of pneumonia ?', 'no']
['did his mother die of pneumonia ?', 'no .']
['how many long was lincoln s formal education ?', ' months']
['how many long was lincoln s formal education ?', ' months .']
['when did lincoln begin his political career ?', ' ']
['when did lincoln begin his political career ?', ' .']
['what did the legal tender act of establish ?', 'the united states note the first paper currency in united states history']
['what did the legal tender act of establis
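The printed pairs above contain answers such as ' months' and ' ', which suggests that the normalization step removes digits. The sketch below re-declares the two helpers from the notebook under snake_case names (so it runs on its own) and shows the effect. The variant `normalize_keep_digits` is a hypothetical alternative for comparison, not something the notebook uses.

```python
import re
import unicodedata

def unicode_to_ascii(s):
    # Drop combining marks after NFD decomposition (same logic as unicodeToAscii)
    return "".join(c for c in unicodedata.normalize("NFD", s)
                   if unicodedata.category(c) != "Mn")

def normalize_string(s):
    # Same steps as normalizeString in the notebook
    s = unicode_to_ascii(s.lower().strip())
    s = re.sub(r"([.!?])", r" \1", s)        # put a space before . ! ?
    s = re.sub(r"[^a-zA-Z.!?]+", r" ", s)    # every other non-letter run becomes one space
    return s

def normalize_keep_digits(s):
    # Hypothetical variant: identical, except that digits survive
    s = unicode_to_ascii(s.lower().strip())
    s = re.sub(r"([.!?])", r" \1", s)
    s = re.sub(r"[^a-zA-Z0-9.!?]+", r" ", s)
    return s

print(repr(normalize_string("12 months")))
print(repr(normalize_string("What did Alessandro Volta invent in 1800?")))
print(repr(normalize_keep_digits("12 months")))
```

With the original pattern, any answer that consists only of a number collapses to a single space, so several distinct targets become indistinguishable during training and evaluation.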

## The architecture we are building

![image](https://miro.medium.com/max/1838/1*tXchCn0hBSUau3WO0ViD7w.jpeg)

As the diagram shows, the model has an encoder, an attention mechanism block, and a decoder. In the final code the attention mechanism block and the decoder are merged into a single block, since the two need to work together.

The attention block needs access to every encoder output, so we keep a copy of h1, h2, h3 and h4. These are the encoder outputs for a sentence with 4 words.
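To make the role of h1 to h4 concrete, here is a small plain-Python sketch of the attention step: scores are turned into weights with a softmax, and the weights are used to form a weighted sum of the encoder outputs. The vectors and scores are made-up numbers; in the AttnDecoderRNN class later in this notebook the scores come from a linear layer applied to the embedded decoder input and the hidden state, and the weighted sum is computed with torch.bmm.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical encoder outputs h1..h4 for a 4-word sentence, hidden size 3
encoder_outputs = [
    [1.0, 0.0, 0.0],  # h1
    [0.0, 1.0, 0.0],  # h2
    [0.0, 0.0, 1.0],  # h3
    [1.0, 1.0, 1.0],  # h4
]

# Hypothetical unnormalized attention scores, one per encoder position
scores = [2.0, 1.0, 0.5, 0.1]

attn_weights = softmax(scores)

# Context vector: weighted sum of the encoder outputs, one value per hidden dimension
context = [
    sum(w * h[d] for w, h in zip(attn_weights, encoder_outputs))
    for d in range(3)
]

print(attn_weights)
print(context)
```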

## Encoder
We will build our encoder with a GRU, but that is all we know so far. Let's NOT build a class straight away; instead, let's see how to arrive at one for the encoder. We need to answer a few questions first:

1. What would be the hidden size of our GRU?
2. What would be the input size?
3. What would be the embedding dimensions?

For simplicity, let's keep 1. and 3. at 256.

We can't feed our input directly to the GRU; we first need to tensorize it and convert it to embeddings.

embedding = nn.Embedding(input_size, hidden_size)

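What is input_size?
It is the number of distinct tokens in the question vocabulary, i.e. input_data.n_words, which comes from the line below:

input_data, output_data, pairs = prepare_data(src, dest, root_dir, sub_dir, file_name, False)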
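As a rough illustration of where the embedding's input size comes from, the sketch below counts a vocabulary the same way the Lang class does: indices 0 and 1 are reserved for SOS and EOS, and every new word gets the next free index. The function name `build_vocab` is hypothetical; the two questions are taken from the evaluation output later in the notebook.

```python
def build_vocab(sentences):
    # Mirrors Lang.addSentence / Lang.addWord, without the word counts
    word2index = {}
    n_words = 2  # 0 = SOS, 1 = EOS
    for sentence in sentences:
        for word in sentence.split(" "):
            if word not in word2index:
                word2index[word] = n_words
                n_words += 1
    return word2index, n_words

questions = [
    "did lincoln win the election of ?",
    "did avogadro submit his poem ?",
]

word2index, n_words = build_vocab(questions)

# n_words is the number of rows the embedding table needs:
# 11 distinct tokens plus the 2 reserved indices
print(n_words)
print(word2index["avogadro"])
```

In the notebook the corresponding value is input_data.n_words, which is passed as the first argument to EncoderRNN.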

In [7]:
class EncoderRNN(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(EncoderRNN, self).__init__()
        self.hidden_size = hidden_size

        self.embedding = nn.Embedding(input_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

    def forward(self, input, hidden):
        embedded = self.embedding(input).view(1, 1, -1)
        output = embedded
        output, hidden = self.gru(output, hidden)
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, 1, self.hidden_size, device=device)

## Decoder

In [8]:
class AttnDecoderRNN(nn.Module):
    def __init__(self, hidden_size, output_size, dropout_p=0.1, max_length=MAX_LENGTH):
        super(AttnDecoderRNN, self).__init__()
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.dropout_p = dropout_p
        self.max_length = max_length

        self.embedding = nn.Embedding(self.output_size, self.hidden_size)
        self.attn = nn.Linear(self.hidden_size * 2, self.max_length)
        self.attn_combine = nn.Linear(self.hidden_size * 2, self.hidden_size)
        self.dropout = nn.Dropout(self.dropout_p)
        self.gru = nn.GRU(self.hidden_size, self.hidden_size)
        self.out = nn.Linear(self.hidden_size, self.output_size)

    def forward(self, input, hidden, encoder_outputs):
        embedded = self.embedding(input).view(1, 1, -1)
        embedded = self.dropout(embedded)

        attn_weights = F.softmax(
            self.attn(torch.cat((embedded[0], hidden[0]), 1)), dim=1)
        attn_applied = torch.bmm(attn_weights.unsqueeze(0),
                                 encoder_outputs.unsqueeze(0))

        output = torch.cat((embedded[0], attn_applied[0]), 1)
        output = self.attn_combine(output).unsqueeze(0)

        output, hidden = self.gru(output, hidden)
        output = F.relu(output)

        output = F.log_softmax(self.out(output[0]), dim=1)
        return output, hidden, attn_weights

    def initHidden(self):
        return torch.zeros(1, 1, self.hidden_size, device=device)

In [9]:
def indexesFromSentence(lang, sentence):
    return [lang.word2index[word] for word in sentence.split(' ')]


def tensorFromSentence(lang, sentence):
    indexes = indexesFromSentence(lang, sentence)
    indexes.append(EOS_Token)
    return torch.tensor(indexes, dtype=torch.long, device=device).view(-1, 1)


def tensorsFromPair(pair):
    input_tensor = tensorFromSentence(input_data, pair[0])
    target_tensor = tensorFromSentence(output_data, pair[1])
    return (input_tensor, target_tensor)

In [10]:
teacher_forcing_ratio = 0.5


def train(input_tensor, target_tensor, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion, max_length=MAX_LENGTH):
    encoder_hidden = encoder.initHidden()

    encoder_optimizer.zero_grad()
    decoder_optimizer.zero_grad()

    input_length = input_tensor.size(0)
    target_length = target_tensor.size(0)

    encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)

    loss = 0

    for ei in range(input_length):
        encoder_output, encoder_hidden = encoder(
            input_tensor[ei], encoder_hidden)
        encoder_outputs[ei] = encoder_output[0, 0]

    decoder_input = torch.tensor([[SOS_Token]], device=device)

    decoder_hidden = encoder_hidden

    use_teacher_forcing = random.random() < teacher_forcing_ratio

    if use_teacher_forcing:
        # Teacher forcing: Feed the target as the next input
        for di in range(target_length):
            decoder_output, decoder_hidden, decoder_attention = decoder(
                decoder_input, decoder_hidden, encoder_outputs)
            loss += criterion(decoder_output, target_tensor[di])
            decoder_input = target_tensor[di]  # Teacher forcing

    else:
        # Without teacher forcing: use its own predictions as the next input
        for di in range(target_length):
            decoder_output, decoder_hidden, decoder_attention = decoder(
                decoder_input, decoder_hidden, encoder_outputs)
            topv, topi = decoder_output.topk(1)
            decoder_input = topi.squeeze().detach()  # detach from history as input

            loss += criterion(decoder_output, target_tensor[di])
            if decoder_input.item() == EOS_Token:
                break

    loss.backward()

    encoder_optimizer.step()
    decoder_optimizer.step()

    return loss.item() / target_length
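The two branches in train differ only in what is fed to the decoder at the next step. The toy sketch below isolates that difference using plain integers as tokens. The function `decoder_inputs` and the stand-in predictor `always_seven` are hypothetical and replace the real decoder; the early stop on EOS used in the notebook's second branch is left out for brevity.

```python
def decoder_inputs(target, predict, sos=0, teacher_forcing=True):
    """Return the token fed to the decoder at each step."""
    inputs = []
    current = sos
    for gold in target:
        inputs.append(current)
        predicted = predict(current)
        # Teacher forcing feeds the gold token next; otherwise the model's own guess
        current = gold if teacher_forcing else predicted
    return inputs

def always_seven(token):
    # Hypothetical stand-in for the decoder: predicts token 7 regardless of input
    return 7

target = [4, 5, 1]  # 1 plays the role of EOS

print(decoder_inputs(target, always_seven, teacher_forcing=True))
print(decoder_inputs(target, always_seven, teacher_forcing=False))
```

With teacher forcing the decoder always sees the correct previous token, which tends to speed up convergence; without it, early mistakes propagate into later inputs, which is closer to what happens at evaluation time.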

In [11]:
import time
import math


def asMinutes(s):
    m = math.floor(s / 60)
    s -= m * 60
    return '%dm %ds' % (m, s)


def timeSince(since, percent):
    now = time.time()
    s = now - since
    es = s / (percent)
    rs = es - s
    return '%s (- %s)' % (asMinutes(s), asMinutes(rs))
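A short worked example of the arithmetic in timeSince: the elapsed time divided by the fraction completed gives an estimate of the total run time, and the remainder is reported in parentheses. The helper `eta` is a hypothetical variant that takes the elapsed seconds directly instead of reading the clock, so the result is reproducible.

```python
import math

def as_minutes(s):
    # Same formatting as asMinutes in the notebook
    m = math.floor(s / 60)
    s -= m * 60
    return "%dm %ds" % (m, s)

def eta(elapsed, fraction_done):
    # Estimated total time, assuming progress continues at a constant rate
    estimated_total = elapsed / fraction_done
    remaining = estimated_total - elapsed
    return "%s (- %s)" % (as_minutes(elapsed), as_minutes(remaining))

# 150 seconds elapsed with a quarter of the iterations done
print(eta(150, 0.25))
```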

In [12]:
def trainIters(encoder, decoder, n_iters, print_every=1000, plot_every=100, learning_rate=0.01):
    start = time.time()
    plot_losses = []
    print_loss_total = 0  # Reset every print_every
    plot_loss_total = 0  # Reset every plot_every

    encoder_optimizer = optim.SGD(encoder.parameters(), lr=learning_rate)
    decoder_optimizer = optim.SGD(decoder.parameters(), lr=learning_rate)
    training_pairs = [tensorsFromPair(random.choice(pairs))
                      for i in range(n_iters)]
    criterion = nn.NLLLoss()

    for iter in range(1, n_iters + 1):
        training_pair = training_pairs[iter - 1]
        input_tensor = training_pair[0]
        target_tensor = training_pair[1]

        loss = train(input_tensor, target_tensor, encoder,
                     decoder, encoder_optimizer, decoder_optimizer, criterion)
        print_loss_total += loss
        plot_loss_total += loss

        if iter % print_every == 0:
            print_loss_avg = print_loss_total / print_every
            print_loss_total = 0
            print('%s (%d %d%%) %.4f' % (timeSince(start, iter / n_iters),
                                         iter, iter / n_iters * 100, print_loss_avg))

        if iter % plot_every == 0:
            plot_loss_avg = plot_loss_total / plot_every
            plot_losses.append(plot_loss_avg)
            plot_loss_total = 0

    showPlot(plot_losses)

In [13]:
import matplotlib.pyplot as plt
plt.switch_backend('agg')
import matplotlib.ticker as ticker
import numpy as np


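def showPlot(points):
    # plt.subplots() already creates a figure, so a separate plt.figure() call
    # would only leave an extra empty figure behind
    fig, ax = plt.subplots()
    # this locator puts ticks at regular intervals
    loc = ticker.MultipleLocator(base=0.2)
    ax.yaxis.set_major_locator(loc)
    plt.plot(points)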

In [14]:
def evaluate(encoder, decoder, sentence, max_length=MAX_LENGTH):
    with torch.no_grad():
        input_tensor = tensorFromSentence(input_data, sentence)
        input_length = input_tensor.size()[0]
        encoder_hidden = encoder.initHidden()

        encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)

        for ei in range(input_length):
            encoder_output, encoder_hidden = encoder(input_tensor[ei],
                                                     encoder_hidden)
            encoder_outputs[ei] += encoder_output[0, 0]

        decoder_input = torch.tensor([[SOS_Token]], device=device)  # SOS

        decoder_hidden = encoder_hidden

        decoded_words = []
        decoder_attentions = torch.zeros(max_length, max_length)

        for di in range(max_length):
            decoder_output, decoder_hidden, decoder_attention = decoder(
                decoder_input, decoder_hidden, encoder_outputs)
            decoder_attentions[di] = decoder_attention.data
            topv, topi = decoder_output.data.topk(1)
            if topi.item() == EOS_Token:
                decoded_words.append('<EOS>')
                break
            else:
                decoded_words.append(output_data.index2word[topi.item()])

            decoder_input = topi.squeeze().detach()

        return decoded_words, decoder_attentions[:di + 1]

In [15]:
def evaluateRandomly(encoder, decoder, n=10):
    for i in range(n):
        pair = random.choice(pairs)
        print('>', pair[0])
        print('=', pair[1])
        output_words, attentions = evaluate(encoder, decoder, pair[0])
        output_sentence = ' '.join(output_words)
        print('<', output_sentence)
        print('')

In [16]:
hidden_size = 256
encoder1 = EncoderRNN(input_data.n_words, hidden_size).to(device)
attn_decoder1 = AttnDecoderRNN(hidden_size, output_data.n_words, dropout_p=0.1).to(device)

#trainIters(encoder1, attn_decoder1, 75000, print_every=5000)
trainIters(encoder1, attn_decoder1, 25000, print_every=5000)

2m 24s (- 9m 38s) (5000 20%) 2.4813
4m 43s (- 7m 5s) (10000 40%) 2.0292
7m 5s (- 4m 43s) (15000 60%) 1.7695
9m 28s (- 2m 22s) (20000 80%) 1.6152
11m 51s (- 0m 0s) (25000 100%) 1.4508


In [17]:
evaluateRandomly(encoder1, attn_decoder1)

> how many domestic tourists visit melbourne ?
=  . million domestic visitors
<  . <EOS>

> did lincoln win the election of ?
= yes
< yes <EOS>

> what european countries established states in ghana ?
= the uk
< christian <EOS>

> what eves ?
= null
< null <EOS>

> did amedeo avogadro graduate ?
= yes
< yes <EOS>

> what is the official religion in the country ?
= null
< null <EOS>

> what is the official language of romania ?
= romanian .
<  . <EOS>

> did avogadro submit his poem ?
= yes .
< yes . <EOS>

> what does vitula mean ?
= stringed instrument
< null <EOS>

> how many strings does a violin usually have ?
=  
<  <EOS>



# Evaluation Metrics:

In [22]:
def get_actual_predicted_values(encoder, decoder, n=1000):
  target_values, predicted_values = [], []
  for i in range(n):
    pair = random.choice(pairs)
    output_words, attentions = evaluate(encoder, decoder, pair[0])
    predicted_out = ' '.join(output_words[:-1])    
    target_values.append(pair[1])
    predicted_values.append(predicted_out)    
  return target_values, predicted_values

target_values, predicted_values = get_actual_predicted_values(encoder1, attn_decoder1)



In [23]:
# https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html
from sklearn.metrics import precision_recall_fscore_support

def get_true_predicted_lables(target_values, predicted_values):
  y_true, y_pred = [], []

  for t, p in zip(target_values, predicted_values):
    y_true.append(0)
    if(t==p):
      y_pred.append(0)
    else:
      y_pred.append(1)
  return y_true, y_pred

y_true, y_pred = get_true_predicted_lables(target_values, predicted_values)
print(precision_recall_fscore_support(y_true, y_pred, average='macro', zero_division=0))

(0.5, 0.2395, 0.323867478025693, None)


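### Evaluation metrics:
precision = 50 % <br>
recall = 23.95 % <br>
F1 score = 32.39 % <br>

These are macro averages over two labels: 0 (the prediction matches the target exactly) and 1 (no match). Because y_true is always 0, the recall of 23.95 % is half of the exact-match rate, which is therefore 47.9 %.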
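The sketch below recomputes macro precision, recall, and F1 by hand for the labelling scheme used above, where y_true is all zeros and y_pred is 0 for an exact match and 1 otherwise. It is a hand-rolled illustration rather than the scikit-learn implementation, and the count of 479 matches out of 1000 is an assumption inferred from the reported recall, not a number printed by the notebook.

```python
def macro_prf(y_true, y_pred, labels=(0, 1)):
    # Per-label precision, recall and F1, with 0.0 used when a denominator is zero
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Assumed split: 479 exact matches, 521 mismatches
y_true = [0] * 1000
y_pred = [0] * 479 + [1] * 521

precision, recall, f1 = macro_prf(y_true, y_pred)
exact_match = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

print(precision, recall, f1)
print(exact_match)
```

Label 0 has precision 1.0 and label 1 has precision 0, so the macro precision is 0.5 no matter how well the model performs. The exact-match rate is the more direct summary of this comparison.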

# BLEU

In [24]:
# https://pytorch.org/text/stable/data_metrics.html
from torchtext.data.metrics import bleu_score

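# NOTE: this redefines get_actual_predicted_values from In [22].
# As written, the targets are used as the candidates and the predictions as the
# references. bleu_score conventionally takes the model predictions as the
# candidates, so the score below reflects the swapped ordering.
def get_actual_predicted_values(target, predicted):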
  candidate_corpus, references_corpus = [], []

  for t, p in zip(target, predicted):
    candidate_corpus.append(t.split(" "))
    references_corpus.append([p.split(" ")])
  
  return candidate_corpus, references_corpus

candidate_corpus, references_corpus = get_actual_predicted_values(target_values, predicted_values)
bleu_score(candidate_corpus, references_corpus)

0.059832289814949036

In [25]:
print(candidate_corpus[:5])
print(references_corpus[:5])

[['chemist', 'and', 'physicist'], ['null'], ['yes'], ['hassan', 'massoudy'], ['null']]
[[['a', 'and']], [['null']], [['yes']], [['null']], [['null']]]


# BERT Score

In [26]:
!pip install bert-score

from bert_score import BERTScorer
bert_scorer = BERTScorer(lang="en", rescale_with_baseline=True)

Collecting bert-score
  Downloading bert_score-0.3.11-py3-none-any.whl (60 kB)
Collecting transformers>=3.0.0numpy
  Downloading transformers-4.14.1-py3-none-any.whl (3.4 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.46-py3-none-any.whl (895 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.bias']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [31]:
#References:
# https://torchmetrics.readthedocs.io/en/latest/references/modules.html?highlight=BERTScore#bertscore
# https://towardsdatascience.com/machine-translation-evaluation-with-sacrebleu-and-bertscore-d7fdb0c47eb3


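# NOTE: as written, hyps holds the targets and refs holds the predictions.
# BERTScorer.score takes the candidates first and the references second, so
# the reported precision and recall are swapped relative to the usual reading.
def get_target_predicted_vals(target, predicted):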
  hyps, refs = [], []

  for t, p in zip(target, predicted):
    hyps.append(t)
    refs.append([p])
  
  return hyps, refs

hyps, refs = get_target_predicted_vals(target_values, predicted_values)
P, R, F1 = bert_scorer.score(hyps, refs)

# NOTE: P and R are printed as percentages, while F1 is left on the 0-1 scale.
print("Precision={}, Recall={}, F1-score={}".format(P.mean()*100, R.mean()*100, F1.mean()))




Precision=13.805421829223633, Recall=28.832426071166992, F1-score=0.20973071455955505




# Perplexity

In [33]:
# https://huggingface.co/docs/transformers/perplexity

import math

def trainIters(encoder, decoder, n_iters, print_every=1000, plot_every=100, learning_rate=0.01):
    start = time.time()
    plot_losses = []
    print_loss_total = 0  # Reset every print_every
    plot_loss_total = 0  # Reset every plot_every

    encoder_optimizer = optim.SGD(encoder.parameters(), lr=learning_rate)
    decoder_optimizer = optim.SGD(decoder.parameters(), lr=learning_rate)
    training_pairs = [tensorsFromPair(random.choice(pairs))
                      for i in range(n_iters)]

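    # The decoder already returns log-probabilities (log_softmax). CrossEntropyLoss
    # applies log_softmax again, which leaves normalized log-probabilities
    # unchanged, so this is equivalent to the NLLLoss used earlier.
    criterion = nn.CrossEntropyLoss()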

    for iter in range(1, n_iters + 1):
        training_pair = training_pairs[iter - 1]
        input_tensor = training_pair[0]
        target_tensor = training_pair[1]

        loss = train(input_tensor, target_tensor, encoder,
                     decoder, encoder_optimizer, decoder_optimizer, criterion)
        print_loss_total += loss
        plot_loss_total += loss

        if iter % print_every == 0:
            print_loss_avg = print_loss_total / print_every
            print_loss_total = 0
            print('%s | (%d %d%%) | AVG_LOSS = %.4f | PPL = %7.3f |'  % (
                timeSince(start, iter / n_iters), iter, 
                iter / n_iters * 100, print_loss_avg, 
                math.exp(print_loss_avg))
            )
        if iter % plot_every == 0:
            plot_loss_avg = plot_loss_total / plot_every
            plot_losses.append(plot_loss_avg)
            plot_loss_total = 0

    showPlot(plot_losses)


encoder1 = EncoderRNN(input_data.n_words, hidden_size).to(device)
attn_decoder1 = AttnDecoderRNN(
    hidden_size,  output_data.n_words, 
    dropout_p=0.1).to(device)

trainIters(encoder1, attn_decoder1, 25000, print_every=5000)

3m 6s (- 12m 27s) | (5000 20%) | AVG_LOSS = 2.4319 | PPL =  11.381 |
5m 40s (- 8m 30s) | (10000 40%) | AVG_LOSS = 2.0396 | PPL =   7.688 |
8m 16s (- 5m 30s) | (15000 60%) | AVG_LOSS = 1.8178 | PPL =   6.158 |
10m 52s (- 2m 43s) | (20000 80%) | AVG_LOSS = 1.6376 | PPL =   5.143 |
13m 34s (- 0m 0s) | (25000 100%) | AVG_LOSS = 1.5131 | PPL =   4.541 |
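The PPL column is the exponential of the average per-token loss. The minimal sketch below recomputes it from the AVG_LOSS values logged above. The helper name `perplexity` is hypothetical, and tiny differences from the log are possible because the logged losses are rounded to four decimals.

```python
import math

def perplexity(avg_nll):
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    return math.exp(avg_nll)

# AVG_LOSS values copied from the training log
logged_losses = [2.4319, 2.0396, 1.8178, 1.6376, 1.5131]

for loss in logged_losses:
    print("AVG_LOSS = %.4f | PPL = %7.3f" % (loss, perplexity(loss)))
```

A perplexity of about 4.5 means the model is, on average, roughly as uncertain as if it were choosing uniformly among 4.5 tokens at each step. Since the loss is measured on training pairs, it says little about performance on unseen questions.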
