# **8. Question Answeting**

# Loading Dataset


Question
1. What Type of Dataset in Question Answering ?
2. How to Solve ? Reading Comprehension
3. Agendas : Reading Comprehension Task
4. Agendas : Fine-Tuning Pre-Trained Model + Prefix Tuning + Prompt Tuning
4. Comparing Different Approach on

Steps :    

0. Task Brief
1. Loading + Cleaning Fine-Tuning Dataset
2. Model Architecture Change
3. Fine - Tuning Strategies
4. Calculate Metrics based on SQuAD evaluation

# Task Brief

## Dataset : SQuAD Dataset


there are lot variety of dataset that we can use to fine tune on Question Answering task,

you can browse [here](https://huggingface.co/datasets?task_categories=task_categories:question-answering&sort=trending)

Here is a sample from SQuAD Dataset



```
Context
Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.
```








```
question
To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?
```





```
answers
{'text': ['Saint Bernadette Soubirous'], 'answer_start': [515]}
```



Each Training Pairs contains :     
- Context / Passage
- Question
- Answer

In [1]:
# Load this library
import numpy as np
import pandas as pd


import math
import os
import re
from typing import Tuple

import torch
import torch.nn.functional as F
from torch import nn, Tensor

from torchtext import datasets

from tqdm import tqdm
import time

import matplotlib.pyplot as plt



# Loading & Cleaning Dataset

## Loading SQuAD Dataset

In [2]:
import torch
from torch.utils.data import Dataset
from datasets import load_dataset


squad_train = load_dataset("squad", split="train[:800]")

squad_valtest= load_dataset("squad", split="validation[:200]")
# split the remaining for test
squad_valtest = squad_valtest.train_test_split(test_size=0.5)

squad_val = squad_valtest['train']
squad_test = squad_valtest['test']

  from .autonotebook import tqdm as notebook_tqdm


In [3]:
squad_train['context'][0]

'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.'

In [4]:
squad_train['question'][0]

'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?'

In [5]:
squad_train['answers'][0]

{'text': ['Saint Bernadette Soubirous'], 'answer_start': [515]}

In [6]:
squad_train.shape

(800, 5)

In [7]:
print(f"Train SQuAD size {squad_train.shape} "
    f"| Val SQuAD size {squad_val.shape}"
      f"| Test SQuAD size {squad_test.shape}"
      )


Train SQuAD size (800, 5) | Val SQuAD size (100, 5)| Test SQuAD size (100, 5)


That's quite bigh enough dataset, for now we will scale it only by using , 10% from both training and test

So in order to use the dataset for training, what input spesification should look like ?


- In language model usually we have context --> however in this case we have two information --> `Passage,Questions` , hence we need to combine both as `input`

- After that we perform tokenization on input

### Processing Dataset

In [8]:
example = squad_train[0]

In [9]:
example

{'id': '5733be284776f41900661182',
 'title': 'University_of_Notre_Dame',
 'context': 'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.',
 'question': 'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?',
 'answers': {'text': ['Saint Bernadette Soubirous'], 'answer_start': [515]}}

In [10]:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.add_special_tokens({'pad_token': tokenizer.eos_token})

1

### Concatenanting Passage + Questions

In [11]:
input = example['context'] + example['question']
input

'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?'

In [12]:
len(input)

766

Okay, let's see how much maximum length of our input in squad dataset

Look's like we have maximum input exceed our model context length, GPT2 Context length is `1024`, what we need to do exactly ? Truncate it

### 1. Tokenizing Concatenated Passage + Questions

In [13]:
input = tokenizer(
        example["question"],
        example["context"],
        max_length=384,
        truncation="only_second",
        return_offsets_mapping=True,
        padding="max_length",
    )

Let's see what tokenized text looks like

In [14]:
tokenizer.decode(input['input_ids'])

'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.<|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext

Most of the tokenized text, contain `[PAD]` token, we need to mask them during loss calculation

How the answer should look like, since we want to perform supervised learning --> we need to provide the model what is the correct answer,

Back to our task again , is classifying whether each token is `<start>` or `<end>` of answer .

### 2. Collect Start & End Position of each Answers

Now

In [15]:
def preprocess_function(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=384,
        truncation="only_second",
        return_offsets_mapping=True,
        padding="max_length",
    )

    offset_mapping = inputs.pop("offset_mapping")
    answers = examples["answers"]
    start_positions = []
    end_positions = []

    for i, offset in enumerate(offset_mapping):

        answer = answers[i]

        start_char = answer["answer_start"][0]
        end_char = answer["answer_start"][0] + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids()
        # Find the start and end of the context
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        # If the answer is not fully inside the context, label it (0, 0)
        if offset[context_start][0] > end_char or offset[context_end][1] < start_char:
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Otherwise it's the start and end token positions
            idx = context_start
            while idx <= context_end and offset[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offset[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

In [16]:
squad_train_tokenized = squad_train.map(preprocess_function, batched=True, remove_columns=['id', 'title','context','question', 'answers']).with_format("torch")
squad_val_tokenized = squad_val.map(preprocess_function, batched=True, remove_columns=['id', 'title','context','question', 'answers']).with_format("torch")
squad_test_tokenized = squad_test.map(preprocess_function, batched=True, remove_columns=['id', 'title','context','question', 'answers']).with_format("torch")

Map: 100%|██████████| 100/100 [00:00<00:00, 2710.92 examples/s]
Map: 100%|██████████| 100/100 [00:00<00:00, 3101.70 examples/s]


In [17]:
squad_train_tokenized

Dataset({
    features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
    num_rows: 800
})

In [18]:
from torch.utils.data import DataLoader
BATCH_SIZE = 8
train_loader = DataLoader(squad_train_tokenized,batch_size=BATCH_SIZE)
val_loader = DataLoader(squad_val_tokenized,batch_size=BATCH_SIZE)
test_laoder = DataLoader(squad_test_tokenized,batch_size=BATCH_SIZE)

In [19]:
# check sampling

for idx,x in enumerate(train_loader) :
  print(f'Iter {idx+1}')
  print(x['input_ids'])

Iter 1
tensor([[ 2514,  4150,   750,  ..., 50256, 50256, 50256],
        [ 2061,   318,   287,  ..., 50256, 50256, 50256],
        [  464, 32520,  3970,  ..., 50256, 50256, 50256],
        ...,
        [ 2215,   750,   262,  ..., 50256, 50256, 50256],
        [ 2437,  1690,   318,  ..., 50256, 50256, 50256],
        [ 2061,   318,   262,  ..., 50256, 50256, 50256]])
Iter 2
tensor([[ 2437,   867,  3710,  ..., 50256, 50256, 50256],
        [  818,   644,   614,  ..., 50256, 50256, 50256],
        [ 8496,   318,   262,  ..., 50256, 50256, 50256],
        ...,
        [ 2061,  3925,  2107,  ..., 50256, 50256, 50256],
        [13828, 11596,   750,  ..., 50256, 50256, 50256],
        [ 2437,   867, 24218,  ..., 50256, 50256, 50256]])
Iter 3
tensor([[  818,   644,   614,  ..., 50256, 50256, 50256],
        [ 8421,   262,  6282,  ..., 50256, 50256, 50256],
        [ 2437,   867, 13346,  ..., 50256, 50256, 50256],
        ...,
        [ 2437,   867, 15273,  ..., 50256, 50256, 50256],
        [ 

# Model Architecture Change

## Load Base Model

To perform fine-tuning process we usually have a base model / pre-trained model which learned from these objectives :     
- Mask Language Modelling / Next Word Prediction / Span Corruption

We load these components :    
1. Pretrained Model
2. Trained Tokenizer

### Hide

In [20]:
from transformers import AutoTokenizer, AutoModelForCausalLM
gpt = AutoModelForCausalLM.from_pretrained("distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")


In [21]:
class DistilledGPT2ForQA(nn.Module) :
    def __init__(self,pretrained_model,device) :
        super().__init__()
        # take last layer / before logit
        self.pretrained_model = pretrained_model.transformer
        # take pretrained_model embedding dimension
        self.n_dim  = 768
        # qa_outputs out_features --> 2 , the label is start and end span
        self.n_outputs = 2
        self.qa_outputs = nn.Linear(self.n_dim,self.n_outputs)
        self.device = device


    def forward(self,input_ids= None,
        attention_mask= None,
        token_type_ids= None,
        position_ids=None,
        head_mask=None,
        inputs_embeds=None,
        output_attentions= None,
        output_hidden_states= None,
        return_dict= None) :
      # feed to pretrained first
      B,T = input_ids.shape
      position_ids = torch.arange(0, T, dtype=torch.long, device=self.device).unsqueeze(0)
      x = self.pretrained_model(input_ids=input_ids,attention_mask=attention_mask,
                                token_type_ids=token_type_ids,position_ids=position_ids,
                                head_mask=head_mask,inputs_embeds=inputs_embeds,
                                output_attentions=output_attentions,
                                output_hidden_states=output_hidden_states,return_dict=return_dict)
      print(type(x))

      # get logit
      logit = self.qa_outputs(x)

      # return logit
      return logit

In [22]:
input.keys()

dict_keys(['input_ids', 'attention_mask', 'offset_mapping'])

### Appear

In [23]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Av

In [24]:
from transformers import AutoModelForQuestionAnswering

model_qa = AutoModelForQuestionAnswering.from_pretrained('distilgpt2').to(device)

Some weights of GPT2ForQuestionAnswering were not initialized from the model checkpoint at distilgpt2 and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [25]:
model_qa

GPT2ForQuestionAnswering(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-5): 6 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2SdpaAttention(
          (c_attn): Conv1D(nf=2304, nx=768)
          (c_proj): Conv1D(nf=768, nx=768)
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D(nf=3072, nx=768)
          (c_proj): Conv1D(nf=768, nx=3072)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (qa_outputs): Linear(in_features=768, out_features=2, bias=True)
)

In [26]:
torch.cuda.empty_cache()

In [27]:
for train_samples in train_loader :
    print(train_samples)
    break

{'input_ids': tensor([[ 2514,  4150,   750,  ..., 50256, 50256, 50256],
        [ 2061,   318,   287,  ..., 50256, 50256, 50256],
        [  464, 32520,  3970,  ..., 50256, 50256, 50256],
        ...,
        [ 2215,   750,   262,  ..., 50256, 50256, 50256],
        [ 2437,  1690,   318,  ..., 50256, 50256, 50256],
        [ 2061,   318,   262,  ..., 50256, 50256, 50256]]), 'attention_mask': tensor([[1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        ...,
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0]]), 'start_positions': tensor([132,   0,   0,   0,   0,  64,  98, 125]), 'end_positions': tensor([139,   0,   0,   0,   0,  66,  98, 126])}


In [28]:
for train_samples in train_loader :
  input_ids = train_samples['input_ids'].to(device)
  print(input_ids.shape)
  # attention mask
  attention_mask = train_samples['attention_mask'].to(device)

  # load model
  logit = model_qa(input_ids=input_ids,attention_mask=attention_mask)

  break

torch.Size([8, 384])


Extract the logit score (last layer)

In [29]:
start_logit = logit.start_logits
end_logit = logit.end_logits

In [30]:
start_softmax = torch.nn.functional.softmax(start_logit,dim=-1)
end_softmax = torch.nn.functional.softmax(end_logit,dim=-1)

In [31]:
start_ans = torch.topk(start_softmax,k=1).indices
end_ans = torch.topk(end_softmax,k=1).indices

In [32]:
tokenizer.decode(start_ans[0,:])

"'"

In [33]:
tokenizer.decode(end_ans[0,:])

'P'

In [34]:
squad_train['context'][0]

'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.'

Try one sample

### Retrain the model + Parameter Efficient Fine-Tuning

but before that we should create training function

In [35]:
!pip install peft accelerate

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)




In [36]:
tokenizer.eos_token

'<|endoftext|>'

In [37]:
tokenizer.add_special_tokens({'pad_token': tokenizer.eos_token})
model_qa.config.pad_token_id = model_qa.config.eos_token_id

In [None]:
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="finetuned_distil_gpt2",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model_qa,
    args=training_args,
    train_dataset=squad_train_tokenized,
    eval_dataset=squad_val_tokenized,
    tokenizer=tokenizer
)

trainer.train()

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  trainer = Trainer(


Epoch,Training Loss,Validation Loss
1,No log,3.178417
2,No log,2.542708


Save Model

In [62]:
trainer.save_model("content/finetuned_distil_gpt2")

In [63]:
model_finetuned =AutoModelForQuestionAnswering.from_pretrained('content/finetuned_distil_gpt2').to(device)

In [64]:
model_finetuned

GPT2ForQuestionAnswering(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-5): 6 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (qa_outputs): Linear(in_features=768, out_features=2, bias=True)
)

### Prediction

In [66]:
example = squad_test[0]

In [67]:
example

{'id': '56bf21b43aeaaa14008c9526',
 'title': 'Super_Bowl_50',
 'context': "The league announced on October 16, 2012, that the two finalists were Sun Life Stadium and Levi's Stadium. The South Florida/Miami area has previously hosted the event 10 times (tied for most with New Orleans), with the most recent one being Super Bowl XLIV in 2010. The San Francisco Bay Area last hosted in 1985 (Super Bowl XIX), held at Stanford Stadium in Stanford, California, won by the home team 49ers. The Miami bid depended on whether the stadium underwent renovations. However, on May 3, 2013, the Florida legislature refused to approve the funding plan to pay for the renovations, dealing a significant blow to Miami's chances.",
 'question': 'What was the last Super Bowl that took place at Sun Life Stadium in Miami? ',
 'answers': {'text': ['Super Bowl XLIV', 'Super Bowl XLIV', '2010'],
  'answer_start': [242, 242, 261]}}

In [68]:

inputs = tokenizer.encode_plus(example['question'],example['context'],return_tensors='pt').to(device)

In [93]:
inputs['input_ids'].shape

torch.Size([1, 149])

In [70]:
outputs = model_finetuned(**inputs)

In [71]:
answer_start = torch.argmax(outputs[0])
answer_end = torch.argmax(outputs[1]) + 1

In [91]:
start_logit = outputs.start_logits
start_logit.shape

torch.Size([1, 149])

In [73]:
end_logit = outputs.end_logits
end_logit

tensor([[ -0.0493,  -8.7085,  -8.1685,  -6.5768,  -7.2861,  -6.9952,  -8.5725,
          -6.7387, -10.6892,  -9.1734,  -8.0847,  -5.4625,  -3.7948,  -7.7111,
          -4.8642,  -7.6763,  -7.0149,  -8.7619,  -7.2607,  -9.7084,  -8.2131,
          -6.5272,  -5.7186,  -9.4201,  -3.6045,  -9.1746, -10.5124,  -8.9089,
          -6.4716,  -6.3749,  -9.6435,  -7.9720,  -5.8928,  -5.1505,  -9.8025,
          -5.4918,  -6.7729,  -5.2714,  -6.0076,  -7.5251,  -7.0350,  -7.5446,
          -9.4085,  -7.5396,  -6.6433,  -8.0638,  -8.1219,  -8.8850,  -8.6233,
          -5.8670,  -4.4821,  -4.4662,  -9.1933,  -9.0341, -10.2257, -10.6135,
          -8.0383, -11.3468,  -7.3806,  -5.6575,  -8.7674, -11.1200,  -9.4360,
          -7.4665,  -9.4113,  -9.3786, -10.4345,  -8.1279,  -8.4324,  -6.4169,
          -3.5528,  -8.8779,  -2.5434,  -6.1163,  -7.5707,  -6.2057,  -6.8929,
          -7.7318,  -4.8213,  -7.2300,  -8.8359,  -8.4083,  -2.1599,  -9.4574,
          -8.0993,  -7.5200,  -6.5956,  -4.3735,  -7

In [90]:
end_logit.shape

torch.Size([1, 149])

In [118]:
#extract position
answer_start = torch.topk(F.softmax(start_logit,dim=-1),k=1).indices.item()
answer_start

0

In [119]:
#extract position
answer_end = (torch.topk(F.softmax(end_logit,dim=-1),k=1).indices).item()
answer_end

0

In [116]:
ans_token = inputs['input_ids'][:,answer_start:answer_end+1].view(-1).tolist()
ans_token

[2061, 373]

In [117]:
answer = tokenizer.decode(ans_token)
answer

'What was'

[[2061]]

In [123]:
def generate_answer(context, question,model=model_finetuned,tokenizer=tokenizer):
  inputs = tokenizer.encode_plus(context,question,return_tensors='pt',
                                 truncation="only_second",return_offsets_mapping=False,
                                 padding="max_length").to(device)
  outputs = model(**inputs)
  start_logit = outputs.start_logits
  end_logit = outputs.end_logits
  answer_start = torch.topk(F.softmax(start_logit,dim=-1),k=1).indices.item()
  answer_end = torch.topk(F.softmax(end_logit,dim=-1),k=1).indices.item()
  ans_token = inputs['input_ids'][:,answer_start:answer_end+1].view(-1).tolist()
  answer  = tokenizer.decode(ans_token)

  return answer


In [124]:
squad_test_tokenized

Dataset({
    features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
    num_rows: 100
})

In [125]:
for i in range(10) :
  data = squad_test[i]
  context = data['context']
  question = data['question']
  ans = generate_answer(question=question,context=context,model=model_finetuned,tokenizer=tokenizer)
  print(ans)

The
For
The
The
The
The
The
In
Super
Super
