# **Question Answering System**


## **INTRODUCTION**


---




*   Automatically answering questions accurately remains a difficult problem in natural language processing.

*   Question Answering (QA) is a branch of Natural Language Understanding. It aims to build systems that, given a question in natural language, extract the relevant information from the provided data and present it as a natural-language answer.



*   QA systems let a user ask a question in natural language and get an immediate, brief response.
*   QA systems are now built into search engines and phone conversational interfaces, where they answer many questions fairly well.










## **DATASET**


---



The dataset contains three question files, one for each year of students (S08, S09, and S10), along with 690,000 words of cleaned Wikipedia text that was used to generate the questions.



The "question_answer_pairs.txt" files contain both the questions and the answers. The columns in these files are as follows:

*   ArticleTitle: Name of the Wikipedia article from which the questions and answers originally came.
*   Question: The question that needs to be answered.
*   Answer: The answer to the question.
*   DifficultyFromQuestioner: Prescribed difficulty rating for the question, as given to the question writer.
*   DifficultyFromAnswerer: Difficulty rating assigned by the individual who evaluated and answered the question, which may differ from DifficultyFromQuestioner.
*   ArticleFile: Name of the file with the relevant article.
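
As a rough illustration of this layout, the sketch below parses two made-up rows in the same tab-separated, six-column format using only the standard library. The sample rows and the `load_pairs` helper are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Hypothetical sample in the same tab-separated layout as question_answer_pairs.txt.
# These rows are invented for illustration; they are not taken from the dataset.
SAMPLE = (
    "ArticleTitle\tQuestion\tAnswer\tDifficultyFromQuestioner\t"
    "DifficultyFromAnswerer\tArticleFile\n"
    "Example_Article\tWhat color is the sky?\tblue\teasy\teasy\tdata/set0/a0\n"
    "Example_Article\tIs the sky green?\tno\teasy\tmedium\tdata/set0/a0\n"
)


def load_pairs(text):
    """Return one dict per row, keyed by the column names in the header line."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = load_pairs(SAMPLE)
for row in rows:
    print(row["Question"], "->", row["Answer"])
```

Each row becomes a dictionary, so a column such as `ArticleFile` can be read by name rather than by position.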
















In [None]:
# Mount Google Drive

from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Load "question_answer_pairs.txt", which contains both the questions and the answers, into a DataFrame.

import pandas as pd

dataset = pd.read_csv("/content/drive/MyDrive/Question_Answer_Dataset_v1.1/S08/question_answer_pairs.txt", 
                      sep="\t",encoding= 'unicode_escape',on_bad_lines='skip')

# Drop rows with any missing value (this includes questions without an answer),
# then drop duplicate questions from the DataFrame.
dataset = dataset.dropna(axis=0)
dataset = dataset.drop_duplicates(subset='Question')
dataset.reset_index(inplace=True)

dataset.tail(10)

In [None]:
# Build the list of questions for one article and load that article's text as the context.
# Both models are run on these questions so that their accuracy can be compared.

# Drop rows whose answer is exactly 'no', 'yes', 'No', or 'Yes'.
dataset = dataset[~dataset['Answer'].isin(['no', 'yes', 'No', 'Yes'])]

article_file = 'data/set3/a4'
df = dataset[dataset['ArticleFile'] == article_file].reset_index(drop=True)
display(df.head())
print(df.shape)
Questions_List = list(df['Question'])

# Read the whole article; it is passed to the models as the context.
c = "/content/drive/MyDrive/Question_Answer_Dataset_v1.1/S08/" + article_file + ".txt"
with open(c) as file:
    context = file.read()

In [None]:
#List of Questions extracted from dataframe

display(Questions_List)

## Question Answering with Hugging Face models

Fine-tuning our first Hugging Face model, **distilbert-base-uncased**, on the adversarial_qa dataset.

In [None]:
%pip install transformers


Collecting transformers
  Downloading transformers-4.18.0-py3-none-any.whl (4.0 MB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.5.1-py3-none-any.whl (77 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.11.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.5 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.49-py3-none-any.whl (895 kB)
Installing collected packages: pyyaml, tokenizers, sacremoses, huggingface-hub, transformers
  Attempting uninstall: pyyaml
    Fo

In [None]:
%pip install datasets

Collecting datasets
  Downloading datasets-2.0.0-py3-none-any.whl (325 kB)
Collecting xxhash
  Downloading xxhash-3.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
Collecting responses<0.19
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting fsspec[http]>=2021.05.0
  Downloading fsspec-2022.3.0-py3-none-any.whl (136 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1
  Downloading urllib3-1.25.11-py2.py3-none-any.whl (127 kB)
Collecting async-timeout<5.0,>=4.0.0a3
  Downloading async_timeout

In [None]:
from transformers import BertForQuestionAnswering

from transformers import AutoTokenizer

from transformers import Trainer, TrainingArguments


In [None]:
from datasets import load_dataset,load_metric

In [None]:
dt = load_dataset("adversarial_qa","adversarialQA")

Downloading builder script:   0%|          | 0.00/2.90k [00:00<?, ?B/s]

Downloading metadata:   0%|          | 0.00/2.04k [00:00<?, ?B/s]

Downloading and preparing dataset adversarial_qa/adversarialQA (download: 8.60 MiB, generated: 31.98 MiB, post-processed: Unknown size, total: 40.58 MiB) to /root/.cache/huggingface/datasets/adversarial_qa/adversarialQA/1.0.0/92356be07b087c5c6a543138757828b8d61ca34de8a87807d40bbc0e6c68f04b...


Downloading data:   0%|          | 0.00/9.02M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/30000 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/3000 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/3000 [00:00<?, ? examples/s]

Dataset adversarial_qa downloaded and prepared to /root/.cache/huggingface/datasets/adversarial_qa/adversarialQA/1.0.0/92356be07b087c5c6a543138757828b8d61ca34de8a87807d40bbc0e6c68f04b. Subsequent calls will reuse this data.


  0%|          | 0/3 [00:00<?, ?it/s]

In [None]:
dt

DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers', 'metadata'],
        num_rows: 30000
    })
    validation: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers', 'metadata'],
        num_rows: 3000
    })
    test: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers', 'metadata'],
        num_rows: 3000
    })
})

In [None]:
model_name = 'distilbert-base-uncased'

In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_name)

Downloading:   0%|          | 0.00/28.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/483 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/226k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/455k [00:00<?, ?B/s]

In [None]:
max_length = 384
stride = 128


def preprocess_training_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    offset_mapping = inputs.pop("offset_mapping")
    sample_map = inputs.pop("overflow_to_sample_mapping")
    answers = examples["answers"]
    start_positions = []
    end_positions = []

    for i, offset in enumerate(offset_mapping):
        sample_idx = sample_map[i]
        answer = answers[sample_idx]
        start_char = answer["answer_start"][0]
        end_char = answer["answer_start"][0] + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)

        # Find the start and end of the context
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        # If the answer is not fully inside the context, label is (0, 0)
        if offset[context_start][0] > start_char or offset[context_end][1] < end_char:
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Otherwise it's the start and end token positions
            idx = context_start
            while idx <= context_end and offset[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offset[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs
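
The label computation in `preprocess_training_examples` maps an answer's character span to token positions through the tokenizer's offsets. The sketch below reproduces that idea on a toy example. It assumes a whitespace tokenizer in place of the real subword tokenizer, and the helpers `whitespace_offsets` and `char_span_to_token_span` are hypothetical names introduced only for illustration.

```python
import re


def whitespace_offsets(text):
    """Toy tokenizer: return (start, end) character offsets of each whitespace-separated token."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def char_span_to_token_span(offsets, start_char, end_char):
    """Map a character span to (start_token, end_token), or (0, 0) if it lies outside the offsets."""
    if not offsets or offsets[0][0] > start_char or offsets[-1][1] < end_char:
        return 0, 0
    # Last token that starts at or before the answer's first character.
    idx = 0
    while idx < len(offsets) and offsets[idx][0] <= start_char:
        idx += 1
    start_token = idx - 1
    # First token that ends at or after the answer's last character.
    idx = len(offsets) - 1
    while idx >= 0 and offsets[idx][1] >= end_char:
        idx -= 1
    end_token = idx + 1
    return start_token, end_token


context = "Lincoln was born in Kentucky in 1809"  # toy context for illustration
answer = "Kentucky"
start_char = context.index(answer)
end_char = start_char + len(answer)

offsets = whitespace_offsets(context)
start_token, end_token = char_span_to_token_span(offsets, start_char, end_char)
tokens = [context[s:e] for s, e in offsets]
print(tokens[start_token:end_token + 1])
```

A span that falls outside the tokenized window gets the label `(0, 0)`, mirroring the fallback used in the function above.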

In [None]:
train_dataset = dt["train"].map(
    preprocess_training_examples,
    batched=True,
    remove_columns=dt["train"].column_names,
)
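# Keep only the first 1,000 tokenized features to shorten training time.
train_dataset = train_dataset.select(range(1000))

# Compare the size of the original training split with the subset used for fine-tuning.
len(dt["train"]), len(train_dataset)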


  0%|          | 0/30 [00:00<?, ?ba/s]

(30000, 1000)
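
Because `return_overflowing_tokens=True` is set in the preprocessing function, a long context is split into several overlapping features, so the tokenized dataset can contain more rows than the original split. The following is a minimal sketch of that windowing, assuming `stride` counts the tokens shared by consecutive windows and ignoring the question and special tokens that the real tokenizer also fits into each window. The helper `split_with_overlap` is a hypothetical name, not part of any library.

```python
def split_with_overlap(tokens, window, stride):
    """Split tokens into chunks of at most `window` tokens; consecutive chunks share `stride` tokens."""
    if not 0 <= stride < window:
        raise ValueError("stride must be non-negative and smaller than window")
    step = window - stride
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
        start += step
    return chunks


tokens = list(range(10))  # stand-in for the token ids of one long context
chunks = split_with_overlap(tokens, window=4, stride=2)
for chunk in chunks:
    print(chunk)
```

The overlap gives an answer that straddles a window boundary a chance to appear whole in at least one chunk.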

In [None]:
def preprocess_validation_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    sample_map = inputs.pop("overflow_to_sample_mapping")
    example_ids = []

    for i in range(len(inputs["input_ids"])):
        sample_idx = sample_map[i]
        example_ids.append(examples["id"][sample_idx])

        sequence_ids = inputs.sequence_ids(i)
        offset = inputs["offset_mapping"][i]
        inputs["offset_mapping"][i] = [
            o if sequence_ids[k] == 1 else None for k, o in enumerate(offset)
        ]

    inputs["example_id"] = example_ids
    return inputs

In [None]:
validation_dataset = dt["validation"].map(
    preprocess_validation_examples,
    batched=True,
    remove_columns=dt["validation"].column_names,
)
# Keep only the first 300 tokenized validation features.
validation_dataset = validation_dataset.select(range(300))

# Compare the size of the original validation split with the subset that is kept.
len(dt["validation"]), len(validation_dataset)


  0%|          | 0/3 [00:00<?, ?ba/s]

(3000, 300)

In [None]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

In [None]:
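# NOTE: distilbert-base-uncased is a DistilBERT checkpoint, but BertForQuestionAnswering is a BERT class.
# The warnings below report that the checkpoint's weights were not used, so the BERT layers
# most likely start from newly initialized weights.
# AutoModelForQuestionAnswering.from_pretrained(model_name) would select the matching architecture.
model = BertForQuestionAnswering.from_pretrained(model_name)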

You are using a model of type distilbert to instantiate a model of type bert. This is not supported for all configurations of models and can yield errors.


Downloading:   0%|          | 0.00/256M [00:00<?, ?B/s]

Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing BertForQuestionAnswering: ['distilbert.transformer.layer.2.output_layer_norm.weight', 'distilbert.embeddings.LayerNorm.weight', 'distilbert.transformer.layer.4.sa_layer_norm.bias', 'distilbert.transformer.layer.1.output_layer_norm.weight', 'vocab_layer_norm.weight', 'distilbert.transformer.layer.4.attention.v_lin.weight', 'distilbert.transformer.layer.5.sa_layer_norm.bias', 'distilbert.transformer.layer.5.sa_layer_norm.weight', 'distilbert.transformer.layer.1.attention.k_lin.bias', 'distilbert.embeddings.word_embeddings.weight', 'vocab_transform.weight', 'distilbert.transformer.layer.5.attention.k_lin.weight', 'distilbert.transformer.layer.0.ffn.lin2.weight', 'distilbert.transformer.layer.3.sa_layer_norm.weight', 'distilbert.transformer.layer.0.attention.out_lin.weight', 'distilbert.transformer.layer.1.attention.q_lin.bias', 'distilbert.transformer.layer.2.sa_layer_norm.weight', 'distilbert.

In [None]:
args = TrainingArguments(
    "bert-finetuned-squad",
    evaluation_strategy="no",
    save_strategy="epoch",
    learning_rate=2e-5,
    num_train_epochs=1,
    weight_decay=0.01,
)

In [None]:
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=validation_dataset,
    tokenizer=tokenizer,
)
trainer.train()

***** Running training *****
  Num examples = 1000
  Num Epochs = 1
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 125


Step,Training Loss


Saving model checkpoint to bert-finetuned-squad/checkpoint-125
Configuration saved in bert-finetuned-squad/checkpoint-125/config.json
Model weights saved in bert-finetuned-squad/checkpoint-125/pytorch_model.bin
tokenizer config file saved in bert-finetuned-squad/checkpoint-125/tokenizer_config.json
Special tokens file saved in bert-finetuned-squad/checkpoint-125/special_tokens_map.json


Training completed. Do not forget to share your model on huggingface.co/models =)




TrainOutput(global_step=125, training_loss=5.20370849609375, metrics={'train_runtime': 3983.2326, 'train_samples_per_second': 0.251, 'train_steps_per_second': 0.031, 'total_flos': 195972567552000.0, 'train_loss': 5.20370849609375, 'epoch': 1.0})
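
The figures in the training log can be cross-checked with a little arithmetic. This sketch assumes that the last partial batch is kept and that there is no gradient accumulation, which matches the log above. All numbers are copied from that log.

```python
import math

num_examples = 1000        # "Num examples" in the training log
batch_size = 8             # "Total train batch size" in the training log
num_epochs = 1             # "Num Epochs" in the training log
train_runtime = 3983.2326  # 'train_runtime' in seconds, from TrainOutput

steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
samples_per_second = round(num_examples * num_epochs / train_runtime, 3)
steps_per_second = round(total_steps / train_runtime, 3)

print(total_steps)          # 125
print(samples_per_second)   # 0.251
print(steps_per_second)     # 0.031
```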

In [None]:
model.save_pretrained("/content/607-project-adeversarial")
tokenizer.save_pretrained("/content/607-project-adeversarial")

Configuration saved in /content/607-project-adeversarial/config.json
Model weights saved in /content/607-project-adeversarial/pytorch_model.bin
tokenizer config file saved in /content/607-project-adeversarial/tokenizer_config.json
Special tokens file saved in /content/607-project-adeversarial/special_tokens_map.json


('/content/607-project-adeversarial/tokenizer_config.json',
 '/content/607-project-adeversarial/special_tokens_map.json',
 '/content/607-project-adeversarial/vocab.txt',
 '/content/607-project-adeversarial/added_tokens.json',
 '/content/607-project-adeversarial/tokenizer.json')

In [None]:
from huggingface_hub import notebook_login

notebook_login()

Login successful
Your token has been saved to /root/.huggingface/token
Authenticated through git-credential store but this isn't the helper defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal in case you want to set this credential helper as the default

git config --global credential.helper store


In [None]:
!curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
!sudo apt-get install git-lfs
!git lfs install

Detected operating system as Ubuntu/bionic.
Checking for curl...
Detected curl...
Checking for gpg...
Detected gpg...
Running apt-get update... done.
Installing apt-transport-https... done.
Installing /etc/apt/sources.list.d/github_git-lfs.list...done.
Importing packagecloud gpg key... done.
Running apt-get update... done.

The repository is setup! You can now install packages.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  git-lfs
0 upgraded, 1 newly installed, 0 to remove and 96 not upgraded.
Need to get 6,800 kB of archives.
After this operation, 15.3 MB of additional disk space will be used.
Get:1 https://packagecloud.io/github/git-lfs/ubuntu bionic/main amd64 git-lfs amd64 3.1.2 [6,800 kB]
Fetched 6,800 kB in 0s (15.8 MB/s)
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl

In [None]:
%cd /content/bert-finetuned-squad/checkpoint-125
!git clone https://{KrishnaAgarwal16}:{}@github.com/{KrishnaAgarwal16}/{607-Project}.git

/content/bert-finetuned-squad/checkpoint-125
Cloning into '{607-Project}'...
fatal: unable to access 'https://{KrishnaAgarwal16}:{}@github.com/{KrishnaAgarwal16}/{607-Project}.git/': The requested URL returned error: 400


In [None]:
model.push_to_hub("607-project-adversarial")

Cloning https://huggingface.co/KrishnaAgarwal16/607-project-adversarial into local empty directory.
Configuration saved in 607-project-adversarial/config.json
Model weights saved in 607-project-adversarial/pytorch_model.bin


Upload file pytorch_model.bin:   0%|          | 32.0k/415M [00:00<?, ?B/s]

To https://huggingface.co/KrishnaAgarwal16/607-project-adversarial
   3200090..7504f83  main -> main



'https://huggingface.co/KrishnaAgarwal16/607-project-adversarial/commit/7504f833dcdbcbfc48f7cf6286601ed5b0d5f535'

In [None]:
model_name = "KrishnaAgarwal16/607-project-adversarial"
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = BertForQuestionAnswering.from_pretrained(model_name)

loading configuration file https://huggingface.co/distilbert-base-uncased/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/23454919702d26495337f3da04d1655c7ee010d5ec9d77bdb9e399e00302c0a1.91b885ab15d631bf9cee9dc9d25ece0afd932f2f5130eba28f2055b2220c0333
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.18.0",
  "vocab_size": 30522
}

loading file https://huggingface.co/distilbert-base-uncased/resolve/main/vocab.txt from cache at /root/.cache/huggingface/transformers/0e1bbfda7f63a99bb52e3915dcf10

Downloading:   0%|          | 0.00/415M [00:00<?, ?B/s]

storing https://huggingface.co/KrishnaAgarwal16/607-project-adversarial/resolve/main/pytorch_model.bin in cache at /root/.cache/huggingface/transformers/9157f6dabd3c9c7378b8ac559fbffd05ce462ba551b541e4977a39b40e8b08e8.011ba628c43abf46c9bf1431f93f7cfeb411e9cb89b7a6967897e848393b7677
creating metadata file for /root/.cache/huggingface/transformers/9157f6dabd3c9c7378b8ac559fbffd05ce462ba551b541e4977a39b40e8b08e8.011ba628c43abf46c9bf1431f93f7cfeb411e9cb89b7a6967897e848393b7677
loading weights file https://huggingface.co/KrishnaAgarwal16/607-project-adversarial/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/9157f6dabd3c9c7378b8ac559fbffd05ce462ba551b541e4977a39b40e8b08e8.011ba628c43abf46c9bf1431f93f7cfeb411e9cb89b7a6967897e848393b7677
All model checkpoint weights were used when initializing BertForQuestionAnswering.

All the weights of BertForQuestionAnswering were initialized from the model checkpoint at KrishnaAgarwal16/607-project-adversarial.
If your 

In [None]:
from transformers import pipeline

In [None]:
model_name = "KrishnaAgarwal16/607-project-adversarial"
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = BertForQuestionAnswering.from_pretrained(model_name)

Downloading:   0%|          | 0.00/28.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/483 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/226k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/455k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/892 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/415M [00:00<?, ?B/s]

In [None]:
# Wrap the model and tokenizer in a question-answering pipeline.
qa = pipeline('question-answering', model=model, tokenizer=tokenizer)
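
An extractive question-answering model scores every token as a possible answer start and a possible answer end, and the answer is the span with the best combined score. The sketch below is a simplified illustration of that selection step, not the pipeline's actual implementation. The helper `best_span`, its `max_answer_len` parameter, and all scores are made up for the example.

```python
def best_span(start_scores, end_scores, max_answer_len=15):
    """Return (start, end, score) maximizing start_scores[s] + end_scores[e] with s <= e."""
    best = None
    for s, s_score in enumerate(start_scores):
        # Only consider ends at or after the start, within the length limit.
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = s_score + end_scores[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


tokens = ["Lincoln", "was", "shot", "by", "John", "Wilkes", "Booth"]
start_scores = [0.1, 0.0, 0.2, 0.1, 2.5, 0.3, 0.2]  # made-up scores
end_scores = [0.2, 0.1, 0.1, 0.0, 0.4, 0.6, 3.0]    # made-up scores

s, e, score = best_span(start_scores, end_scores)
print(tokens[s:e + 1])
```

A model whose weights carry no useful signal produces nearly flat scores, which is consistent with very low confidence values in a pipeline's output.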

In [None]:
# Ask every question against the full article and collect the pipeline's predictions.
score = []
for question in Questions_List:
    x = qa(question=question, context=context)
    print(x)
    score.append(x)






{'score': 0.001872930326499045, 'start': 82583, 'end': 82618, 'answer': 'presidential election, 1864\nUlysses'}
{'score': 0.00113432586658746, 'start': 300, 'end': 312, 'answer': 'presidential'}
{'score': 0.0018682500813156366, 'start': 82583, 'end': 82618, 'answer': 'presidential election, 1864\nUlysses'}
{'score': 0.0016789911314845085, 'start': 82583, 'end': 82595, 'answer': 'presidential'}
{'score': 0.001317953341640532, 'start': 69868, 'end': 69880, 'answer': 'Presidential'}
{'score': 0.001673764898441732, 'start': 82583, 'end': 82595, 'answer': 'presidential'}
{'score': 0.0011371343862265348, 'start': 300, 'end': 312, 'answer': 'presidential'}
{'score': 0.0014163665473461151, 'start': 60486, 'end': 60498, 'answer': 'Presidential'}
{'score': 0.0015923931496217847, 'start': 60486, 'end': 60498, 'answer': 'Presidential'}
{'score': 0.0011301416670903563, 'start': 300, 'end': 312, 'answer': 'presidential'}
{'score': 0.0016768764471635222, 'start': 82583, 'end': 82595, 'answer': 'presi

In [None]:
score

[{'answer': 'presidential election, 1864\nUlysses',
  'end': 82618,
  'score': 0.001872930326499045,
  'start': 82583},
 {'answer': 'presidential',
  'end': 312,
  'score': 0.00113432586658746,
  'start': 300},
 {'answer': 'presidential election, 1864\nUlysses',
  'end': 82618,
  'score': 0.0018682500813156366,
  'start': 82583},
 {'answer': 'presidential',
  'end': 82595,
  'score': 0.0016789911314845085,
  'start': 82583},
 {'answer': 'Presidential',
  'end': 69880,
  'score': 0.001317953341640532,
  'start': 69868},
 {'answer': 'presidential',
  'end': 82595,
  'score': 0.001673764898441732,
  'start': 82583},
 {'answer': 'presidential',
  'end': 312,
  'score': 0.0011371343862265348,
  'start': 300},
 {'answer': 'Presidential',
  'end': 60498,
  'score': 0.0014163665473461151,
  'start': 60486},
 {'answer': 'Presidential',
  'end': 60498,
  'score': 0.0015923931496217847,
  'start': 60486},
 {'answer': 'presidential',
  'end': 312,
  'score': 0.0011301416670903563,
  'start': 300},

In [None]:
# Exact-match accuracy: a prediction counts only if it equals the reference answer string exactly.
count = 0
for i in range(len(score)):
    count += score[i]['answer'] == df['Answer'][i]

acc = count / len(score)
acc


0.0

Using the pre-trained Hugging Face model **deepset/bert-base-cased-squad2** to answer the same list of questions.

In [None]:
model = BertForQuestionAnswering.from_pretrained('deepset/bert-base-cased-squad2')
tokenizer = AutoTokenizer.from_pretrained('deepset/bert-base-cased-squad2')


Downloading:   0%|          | 0.00/508 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/413M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/152 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/208k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]

In [None]:
# Wrap the pre-trained model and its tokenizer in a question-answering pipeline.
qa = pipeline('question-answering', model=model, tokenizer=tokenizer)

In [None]:
# Ask the same questions against the same article with the pre-trained model.
new_score = []
for question in Questions_List:
    x = qa(question=question, context=context)
    print(x)
    new_score.append(x)






{'score': 0.915874183177948, 'start': 7355, 'end': 7364, 'answer': '18 months'}
{'score': 0.9896573424339294, 'start': 8261, 'end': 8265, 'answer': '1832'}
{'score': 0.4672646224498749, 'start': 47540, 'end': 47609, 'answer': 'United States Note, the first paper currency in United States history'}
{'score': 0.9411458373069763, 'start': 57232, 'end': 57244, 'answer': 'Grace Bedell'}
{'score': 0.8433478474617004, 'start': 45061, 'end': 45065, 'answer': '1776'}
{'score': 0.9991657733917236, 'start': 48658, 'end': 48666, 'answer': 'Kentucky'}
{'score': 0.9725509285926819, 'start': 1225, 'end': 1229, 'answer': '1860'}
{'score': 0.9986327886581421, 'start': 44782, 'end': 44799, 'answer': 'John Wilkes Booth'}
{'score': 0.9876161217689514, 'start': 1728, 'end': 1744, 'answer': 'Ulysses S. Grant'}
{'score': 0.9568571448326111, 'start': 75686, 'end': 75693, 'answer': 'slavery'}
{'score': 0.7027047872543335, 'start': 5317, 'end': 5322, 'answer': 'seven'}
{'score': 0.889444887638092, 'start': 1247

In [None]:
# Exact-match accuracy of the pre-trained model, computed the same way as before.
count = 0
for i in range(len(new_score)):
    count += new_score[i]['answer'] == df['Answer'][i]

acc = count / len(new_score)
acc



0.42857142857142855

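So, the pre-trained model reaches an exact-match accuracy of about 0.43, far better than our own fine-tuned model, which did not answer any question correctly. The weight-loading warnings shown when the DistilBERT checkpoint was loaded into BertForQuestionAnswering, together with the single training epoch on 1,000 examples, likely explain that poor result.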
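
The accuracy computed above relies on exact string equality, so a prediction that differs from the reference only in case or punctuation counts as wrong. A more forgiving comparison, sketched below in the style of SQuAD evaluation, normalizes both strings and also reports a token-level F1 score. This is an illustrative alternative, not the metric used in this notebook. The helper names are hypothetical, and the reference answers in the example calls are invented.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """True if the two answers are equal after normalization."""
    return normalize(prediction) == normalize(truth)


def token_f1(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    if not pred_tokens or not true_tokens:
        return float(pred_tokens == true_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


# The reference strings below are invented for illustration.
print(exact_match("John Wilkes Booth", "John Wilkes Booth."))
print(token_f1("Ulysses S. Grant", "Grant"))
```

With such a metric, a partially correct span earns partial credit instead of being scored as a miss.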