### Purpose
### Evaluation of various BERT models for extractive Q&A task
* Given two sequences, a question and a context, BERT extracts the span of the context that answers the question.
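As a rough illustration of what "extracts" means here: an extractive Q&A head gives every context token a start score and an end score, and the answer is the span whose start and end scores sum highest. The sketch below uses made-up logits (they are not output from any model in this notebook), and `best_span` is a hypothetical helper name.

```python
def best_span(start_logits, end_logits, max_answer_len=15):
    """Return (start, end, score) of the highest-scoring span with start <= end."""
    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best[2]:
                best = (i, j, score)
    return best


tokens = ["The", "capital", "of", "Canada", "is", "Ottawa", "."]
# Illustrative logits only; a real model computes these from the question and context.
start_logits = [0.1, 0.3, 0.0, 1.2, 0.2, 4.0, 0.1]
end_logits = [0.0, 0.2, 0.1, 1.0, 0.1, 4.5, 0.3]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

Real pipelines add further steps (subword tokens, softmax over scores, mapping back to character offsets), which this sketch leaves out.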

### Abstract
* We evaluate several BERT models on the extractive Q&A task.  The first three are popular models from Hugging Face.  The last is a fine-tuned version of the "bert-large-uncased" pretrained model.  We test each model with three simple questions, plus one bogus question, and compare the results.

### Team Members
* Sean Tran - 101449600
* Mohammed Mujtaba Rabbani  - 101387404

### Three simple questions to evaluate the models
1.  Trivia Question:  Where is the capital of Canada?
2.  Information seeking:  Where is Sean living these days?
3.  Seeking information from a long text:  Where is Waterfront Campus located?
4.  A fourth, bogus question, whose answer is not present in the text:  When was the new province of Upper Canada created?

### Long text used in question 3
* Wikipedia: https://en.wikipedia.org/wiki/George_Brown_College
* "In September 2012, George Brown opened the Waterfront Campus located at 51 Dockside Dr., south of Queen's Quay between Jarvis and Parliament Streets 
(between Corus Quay and Redpath Sugar Refinery). This campus is home to the Centre for Health Sciences. In 2019, the college expanded its Waterfront Campus 
to the Daniels Waterfront - City of the Arts complex at 3 Lower Jarvis St. - home to the School of Design. And the latest Waterfront Campus expansion, Limberlost Place, 
is set to open at 185 Queens Quay E. in 2025. The 10-storey tall-wood, mass-timber building will be the first institutional building of its kind in Ontario and will house 
the School of Architectural Studies and the School of Computer Technology. It will also house a research institute and a child care centre."



### Bogus Question: When was the new province of Upper Canada created?
* Wikipedia: https://en.wikipedia.org/wiki/Toronto
* Just-for-fun text: "During the American Revolutionary War, an influx of British settlers arrived there as United Empire Loyalists fled for the British-controlled
lands north of Lake Ontario. The Crown granted them land to compensate for their losses in the Thirteen Colonies. The new province of
Upper Canada was being created and needed a capital. In 1787, the British Lord Dorchester arranged for the Toronto Purchase with the
Mississaugas of the New Credit First Nation, thereby securing more than a quarter of a million acres (1000 km2) of land in the Toronto area.
Dorchester intended the location to be named Toronto. The first 25 years after the Toronto purchase were quiet, although
"there were occasional independent fur traders" present in the area, with the usual complaints of debauchery and drunkenness."
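* The answer is not "1787": the passage only says the province "was being created" and gives no date for it; 1787 is the year of the Toronto Purchase.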

### Model1 from Hugging Face: DistilBERT base cased distilled SQuAD
* https://huggingface.co/distilbert-base-cased-distilled-squad

In [None]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
from transformers import pipeline

In [None]:
question_answerer = pipeline("question-answering", model='distilbert-base-cased-distilled-squad')

Downloading (…)lve/main/config.json:   0%|          | 0.00/473 [00:00<?, ?B/s]

Downloading pytorch_model.bin:   0%|          | 0.00/261M [00:00<?, ?B/s]

Downloading (…)okenizer_config.json:   0%|          | 0.00/29.0 [00:00<?, ?B/s]

Downloading (…)solve/main/vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/436k [00:00<?, ?B/s]

#### Simple Trivia Question: Where is the capital of Canada?

In [None]:
context="The capital of Canada is Ottawa."
# The recorded score below was obtained with "capital" misspelled as shown.
question="Where is the captial of Canada?"

question_answerer(question=question, context=context)

{'score': 0.9686177968978882, 'start': 25, 'end': 31, 'answer': 'Ottawa'}

#### Simple Information Seeking Question: Where is Sean living these days?

In [None]:
context="These days Sean lives in Toronto but his sisters had moved to Brampton."
question="Where is Sean living these days?"

question_answerer(question=question, context=context)

{'score': 0.9774247407913208, 'start': 25, 'end': 32, 'answer': 'Toronto'}

#### Simple Information Seeking Question: Where does everyone live?

In [None]:
question="Where does everyone live?"
context="Everyone study at GBC, but they live out of town."

question_answerer(question=question, context=context)

{'score': 0.6477454304695129, 'start': 37, 'end': 48, 'answer': 'out of town'}

#### Seeking information from a long text

In [None]:
# https://en.wikipedia.org/wiki/George_Brown_College

long_text1 = """
In September 2012, George Brown opened the Waterfront Campus located at 51 Dockside Dr., south of Queen's Quay between Jarvis and Parliament Streets 
(between Corus Quay and Redpath Sugar Refinery). This campus is home to the Centre for Health Sciences. In 2019, the college expanded its Waterfront Campus 
to the Daniels Waterfront - City of the Arts complex at 3 Lower Jarvis St. - home to the School of Design. And the latest Waterfront Campus expansion, Limberlost Place, 
is set to open at 185 Queens Quay E. in 2025. The 10-storey tall-wood, mass-timber building will be the first institutional building of its kind in Ontario and will house 
the School of Architectural Studies and the School of Computer Technology. It will also house a research institute and a child care centre.
"""
question = "Where is Waterfront Campus located?"

question_answerer(question=question, context= long_text1)

{'score': 0.9090490937232971,
 'start': 73,
 'end': 88,
 'answer': '51 Dockside Dr.'}
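The `start` and `end` values in the result read as character offsets into the context string, with `end` exclusive, and they count the newline that the triple-quoted string begins with. The check below is a small sketch of that reading; only the opening of `long_text1` is reproduced, since the offsets depend solely on the text before the answer.

```python
# Opening of long_text1, including the leading newline of the triple-quoted string.
text_start = (
    "\n"
    "In September 2012, George Brown opened the Waterfront Campus located at "
    "51 Dockside Dr., south of Queen's Quay"
)
result = {"start": 73, "end": 88, "answer": "51 Dockside Dr."}

# 'start' is inclusive and 'end' is exclusive, like a Python slice.
span = text_start[result["start"]:result["end"]]
print(repr(span))
```

Stripping the leading newline from the context before calling the pipeline would shift both offsets down by one.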

#### Seeking information not explicitly available

In [None]:
# wikipedia https://en.wikipedia.org/wiki/Toronto

long_text_just_for_fun = """
During the American Revolutionary War, an influx of British settlers arrived there as United Empire Loyalists fled for the British-controlled
lands north of Lake Ontario. The Crown granted them land to compensate for their losses in the Thirteen Colonies. The new province of
Upper Canada was being created and needed a capital. In 1787, the British Lord Dorchester arranged for the Toronto Purchase with the
Mississaugas of the New Credit First Nation, thereby securing more than a quarter of a million acres (1000 km2) of land in the Toronto area.
Dorchester intended the location to be named Toronto. The first 25 years after the Toronto purchase were quiet, although
"there were occasional independent fur traders" present in the area, with the usual complaints of debauchery and drunkenness.
"""
trick_question = "When was the new province of Upper Canada created?"

question_answerer(question=trick_question, context=long_text_just_for_fun)

{'score': 0.9288268685340881, 'start': 333, 'end': 337, 'answer': '1787'}

### Model2 from Hugging Face: RoBERTa-base-SQuAD2
* https://huggingface.co/deepset/roberta-base-squad2
* Language model: roberta-base
* Training data: SQuAD 2.0
* Evaluation data: SQuAD 2.0
* Infrastructure: 4x Tesla v100

In [None]:
# !pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.27.3-py3-none-any.whl (6.8 MB)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.13.3-py3-none-any.whl (199 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.13.3 tokenizers-0.13.2 transformers-4.27.3


In [None]:
from transformers import pipeline, BertForQuestionAnswering

In [None]:
model_name = "deepset/roberta-base-squad2"
# No revision is pinned, so the default branch of the checkpoint is used.
qa = pipeline(task="question-answering", model=model_name, tokenizer=model_name)

Downloading (…)lve/main/config.json:   0%|          | 0.00/571 [00:00<?, ?B/s]

Downloading pytorch_model.bin:   0%|          | 0.00/496M [00:00<?, ?B/s]

Downloading (…)okenizer_config.json:   0%|          | 0.00/79.0 [00:00<?, ?B/s]

Downloading (…)olve/main/vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

Downloading (…)olve/main/merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json:   0%|          | 0.00/772 [00:00<?, ?B/s]

#### Simple Trivia Question: Where is the capital of Canada?

In [None]:
context="The capital of Canada is Ottawa."
# The recorded score below was obtained with "capital" misspelled as shown.
question="Where is the captial of Canada?"

sequence = question, context
qa(*sequence)

{'score': 0.8784348368644714, 'start': 25, 'end': 31, 'answer': 'Ottawa'}

#### Simple Information Seeking Question: Where is Sean living these days?

In [None]:
context="These days Sean lives in Toronto but his sisters had moved to Brampton."
question="Where is Sean living these days?"

sequence = question, context
qa(*sequence)

{'score': 0.9577560424804688, 'start': 25, 'end': 32, 'answer': 'Toronto'}

#### Simple Information Seeking Question: Where does everyone live?

In [None]:
question="Where does everyone live?"
context="Everyone study at GBC, but they live out of town."

sequence = question, context
qa(*sequence)

{'score': 0.37237274646759033, 'start': 37, 'end': 48, 'answer': 'out of town'}

#### Seeking information from a long text

In [None]:
# https://en.wikipedia.org/wiki/George_Brown_College

long_text1 = """
In September 2012, George Brown opened the Waterfront Campus located at 51 Dockside Dr., south of Queen's Quay between Jarvis and Parliament Streets 
(between Corus Quay and Redpath Sugar Refinery). This campus is home to the Centre for Health Sciences. In 2019, the college expanded its Waterfront Campus 
to the Daniels Waterfront - City of the Arts complex at 3 Lower Jarvis St. - home to the School of Design. And the latest Waterfront Campus expansion, Limberlost Place, 
is set to open at 185 Queens Quay E. in 2025. The 10-storey tall-wood, mass-timber building will be the first institutional building of its kind in Ontario and will house 
the School of Architectural Studies and the School of Computer Technology. It will also house a research institute and a child care centre.
"""
question = "Where is Waterfront Campus located?"

sequence = question, long_text1
qa(*sequence)

{'score': 0.23121747374534607,
 'start': 73,
 'end': 87,
 'answer': '51 Dockside Dr'}
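This answer ends at offset 87 rather than 88, so it lacks the trailing period that DistilBERT included. When comparing answers across models it can help to normalize them first. The sketch below follows the spirit of the answer normalization used in SQuAD-style evaluation (lowercase, drop punctuation and articles, collapse whitespace); it is not part of the original notebook, and `normalize_answer` is our own helper.

```python
import re
import string

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


distilbert_answer = "51 Dockside Dr."  # end offset 88
roberta_answer = "51 Dockside Dr"      # end offset 87

same = normalize_answer(distilbert_answer) == normalize_answer(roberta_answer)
print(same)
```

Under this normalization the two answers count as the same string.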

#### Seeking information not explicitly available

In [None]:
# wikipedia https://en.wikipedia.org/wiki/Toronto

long_text = """
During the American Revolutionary War, an influx of British settlers arrived there as United Empire Loyalists fled for the British-controlled
lands north of Lake Ontario. The Crown granted them land to compensate for their losses in the Thirteen Colonies. The new province of
Upper Canada was being created and needed a capital. In 1787, the British Lord Dorchester arranged for the Toronto Purchase with the
Mississaugas of the New Credit First Nation, thereby securing more than a quarter of a million acres (1000 km2) of land in the Toronto area.
Dorchester intended the location to be named Toronto. The first 25 years after the Toronto purchase were quiet, although
"there were occasional independent fur traders" present in the area, with the usual complaints of debauchery and drunkenness.
"""
question = "When was the new province of Upper Canada created?"

sequence = question, long_text
qa(*sequence)

{'score': 0.48114240169525146, 'start': 333, 'end': 337, 'answer': '1787'}
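SQuAD 2.0 contains unanswerable questions, and a common decoding rule for such models compares the best span score with a "no answer" score taken from the first ([CLS]) position. The sketch below shows that rule on made-up logits; `decode_with_null` and `null_threshold` are hypothetical names. As far as we know, the Hugging Face question-answering pipeline only allows an empty answer when `handle_impossible_answer=True` is passed, which was not done in the call above.

```python
def decode_with_null(start_logits, end_logits, null_threshold=0.0):
    """Return (start, end) of the best span, or None when 'no answer' wins.

    Position 0 is assumed to be the [CLS] token, whose start + end score
    stands for 'the question cannot be answered from this context'.
    """
    null_score = start_logits[0] + end_logits[0]
    best_score, best_span = float("-inf"), None
    for i in range(1, len(start_logits)):
        for j in range(i, len(end_logits)):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    # Abstain when the null score beats the best span by more than the threshold.
    if null_score - best_score > null_threshold:
        return None
    return best_span


# Made-up logits for illustration only.
answerable = decode_with_null([0.5, 0.1, 3.0, 0.2], [0.4, 0.0, 0.3, 3.5])
unanswerable = decode_with_null([4.0, 0.1, 1.0, 0.2], [4.2, 0.0, 0.3, 1.5])
print(answerable, unanswerable)
```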

### Model3 from Hugging Face: BERT large model (uncased) whole word masking finetuned on SQuAD

In [None]:
squad_pipe = pipeline("question-answering", "bert-large-uncased-whole-word-masking-finetuned-squad")

Downloading (…)lve/main/config.json:   0%|          | 0.00/443 [00:00<?, ?B/s]

Downloading pytorch_model.bin:   0%|          | 0.00/1.34G [00:00<?, ?B/s]

Downloading (…)okenizer_config.json:   0%|          | 0.00/28.0 [00:00<?, ?B/s]

Downloading (…)solve/main/vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

#### Simple Trivia Question: Where is the capital of Canada?

In [None]:
context="The capital of Canada is Ottawa."
# The recorded score below was obtained with "capital" misspelled as shown.
question="Where is the captial of Canada?"

sequence = question, context
squad_pipe(*sequence)

{'score': 0.6013664603233337, 'start': 25, 'end': 31, 'answer': 'Ottawa'}

#### Simple Information Seeking Question: Where is Sean living these days?

In [None]:
context="These days Sean lives in Toronto but his sisters had moved to Brampton."
question="Where is Sean living these days?"

sequence = question, context
squad_pipe(*sequence)

{'score': 0.9870988130569458, 'start': 25, 'end': 32, 'answer': 'Toronto'}

#### Simple Information Seeking Question: Where does everyone live?

In [None]:
question="Where does everyone live?"
context="Everyone study at GBC, but they live out of town."

sequence = question, context
squad_pipe(*sequence)

{'score': 0.9751126766204834, 'start': 37, 'end': 48, 'answer': 'out of town'}

#### Seeking information from a long text

In [None]:
# https://en.wikipedia.org/wiki/George_Brown_College

long_text1 = """
In September 2012, George Brown opened the Waterfront Campus located at 51 Dockside Dr., south of Queen's Quay between Jarvis and Parliament Streets 
(between Corus Quay and Redpath Sugar Refinery). This campus is home to the Centre for Health Sciences. In 2019, the college expanded its Waterfront Campus 
to the Daniels Waterfront - City of the Arts complex at 3 Lower Jarvis St. - home to the School of Design. And the latest Waterfront Campus expansion, Limberlost Place, 
is set to open at 185 Queens Quay E. in 2025. The 10-storey tall-wood, mass-timber building will be the first institutional building of its kind in Ontario and will house 
the School of Architectural Studies and the School of Computer Technology. It will also house a research institute and a child care centre.
"""
question = "Where is Waterfront Campus located?"

sequence = question, long_text1
squad_pipe(*sequence)

{'score': 0.3580397069454193,
 'start': 73,
 'end': 88,
 'answer': '51 Dockside Dr.'}

#### Seeking information not explicitly available

In [None]:
# wikipedia https://en.wikipedia.org/wiki/Toronto

long_text = """
During the American Revolutionary War, an influx of British settlers arrived there as United Empire Loyalists fled for the British-controlled
lands north of Lake Ontario. The Crown granted them land to compensate for their losses in the Thirteen Colonies. The new province of
Upper Canada was being created and needed a capital. In 1787, the British Lord Dorchester arranged for the Toronto Purchase with the
Mississaugas of the New Credit First Nation, thereby securing more than a quarter of a million acres (1000 km2) of land in the Toronto area.
Dorchester intended the location to be named Toronto. The first 25 years after the Toronto purchase were quiet, although
"there were occasional independent fur traders" present in the area, with the usual complaints of debauchery and drunkenness.
"""
question = "When was the new province of Upper Canada created?"

sequence = question, long_text
squad_pipe(*sequence)

{'score': 0.7342039346694946, 'start': 333, 'end': 337, 'answer': '1787'}

### Model4: Fine-Tune the "bert-large-uncased" pretrained model

#### Dataset used: a variation of adversarial_qa, used in the O'Reilly video series
* https://huggingface.co/datasets/adversarial_qa

In [None]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
!pip install datasets

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
from transformers import BertTokenizerFast, BertForQuestionAnswering, pipeline, \
                         DataCollatorWithPadding, TrainingArguments, Trainer, \
                         AutoModelForQuestionAnswering, AutoTokenizer
from datasets import Dataset
import pandas as pd

In [None]:
bert_tokenizer = BertTokenizerFast.from_pretrained('bert-large-uncased', return_token_type_ids=True)

qa_bert = BertForQuestionAnswering.from_pretrained('bert-large-uncased')

Some weights of the model checkpoint at bert-large-uncased were not used when initializing BertForQuestionAnswering: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-large-uncased

In [None]:
# mount google drive if needed
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
qa_df = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/data/qa.csv')

qa_df.shape

(29989, 5)

In [None]:
qa_df.head()

|Unnamed: 0|question|context|start_positions|end_positions|answer|
|--|--|--|--|--|--|
|0|What sare the benifts of the blood brain barrir?|Another approach to brain function is to exami...|56|60|isolated from the bloodstream|
|1|What is surrounded by cerebrospinal fluid?|Another approach to brain function is to exami...|16|16|brain|
|2|What does the skull protect?|Another approach to brain function is to exami...|11|11|brain|
|3|What has been injected into rats to produce pr...|Another approach to brain function is to exami...|153|153|chemicals|
|4|What can cause issues with how the brain works?|Another approach to brain function is to exami...|93|94|brain damage|


In [None]:
# use only 4,000 examples to speed up training time
# qa_dataset = Dataset.from_pandas(qa_df.sample(4000, random_state=42))

# Testing 8000 samples
qa_dataset = Dataset.from_pandas(qa_df.sample(8000, random_state=42))

# Dataset has a built in train test split method
qa_dataset = qa_dataset.train_test_split(test_size=0.2)

In [None]:
# standard preprocessing here with truncation on to truncate longer text
def preprocess(data):
    return bert_tokenizer(data['question'], data['context'], truncation=True)

qa_dataset = qa_dataset.map(preprocess, batched=True)

Map:   0%|          | 0/6400 [00:00<?, ? examples/s]

Map:   0%|          | 0/1600 [00:00<?, ? examples/s]
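For a question/context pair, a BERT-style tokenizer builds one sequence of the form `[CLS] question [SEP] context [SEP]` and marks the two segments with token type ids 0 and 1. The toy function below mimics that layout with pre-split words; it does not use the real tokenizer, `encode_pair` is a hypothetical name, and its truncation is a simplification that cuts from the end of the context.

```python
def encode_pair(question_tokens, context_tokens, max_length=None):
    """Toy stand-in for a BERT-style pair encoding (no real tokenizer involved)."""
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + context_tokens + ["[SEP]"]
    # Segment 0 covers [CLS] + question + first [SEP]; segment 1 covers the rest.
    type_ids = [0] * (len(question_tokens) + 2) + [1] * (len(context_tokens) + 1)
    if max_length is not None and len(tokens) > max_length:
        # Crude truncation: cut the tail of the context but keep a closing [SEP].
        tokens = tokens[: max_length - 1] + ["[SEP]"]
        type_ids = type_ids[:max_length]
    return tokens, type_ids


q = ["where", "is", "sean", "living", "?"]
c = ["sean", "lives", "in", "toronto", "."]

tokens, type_ids = encode_pair(q, c)
short_tokens, short_type_ids = encode_pair(q, c, max_length=10)
print(tokens)
print(type_ids)
print(short_tokens)
```

In the truncated version the word "toronto" is gone, which illustrates why an answer located past the cut can no longer be found in the model input.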

#### Freeze all pre-trained layers except the last two encoder layers and the head

In [None]:
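# Freeze the embeddings and encoder layers 0-21 to speed up training.
# named_parameters() yields parameters in layer order, so the loop stops at layer 22,
# leaving encoder layers 22-23 and the QA head trainable.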
for name, param in qa_bert.bert.named_parameters():
    if 'encoder.layer.22' in name:
        break
    param.requires_grad = False  # disable training in BERT
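The loop above relies on the order in which parameters are listed: everything seen before the first `encoder.layer.22` name is frozen. The sketch below replays that logic on a short, hypothetical list of parameter names (one per layer, not the full list of a real model) to show which ones stay trainable; `frozen_names` is our own helper.

```python
def frozen_names(param_names, stop_marker="encoder.layer.22"):
    """Mimic the freezing loop: freeze names in order until the marker appears."""
    frozen = []
    for name in param_names:
        if stop_marker in name:
            break
        frozen.append(name)
    return frozen


# Hypothetical, abbreviated names in the order a 24-layer BERT encoder lists them.
names = ["embeddings.word_embeddings.weight"]
names += [f"encoder.layer.{i}.output.dense.weight" for i in range(24)]

frozen = frozen_names(names)
trainable = [n for n in names if n not in frozen]
print(len(frozen), trainable)
```

Because the loop only walks `qa_bert.bert`, the question-answering output layer on top is never touched and also remains trainable.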

In [None]:
data_collator = DataCollatorWithPadding(tokenizer=bert_tokenizer)

In [None]:
!mkdir qa_result_8K_3epoch
!mkdir qa_logs_8K_3epoch

In [None]:
batch_size = 32
epochs = 3

training_args = TrainingArguments(
    output_dir='./qa_result_8K_3epoch',
    num_train_epochs=epochs,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    logging_dir='./qa_logs_8K_3epoch',
    save_strategy='epoch',
    logging_steps=10,
    evaluation_strategy='epoch',
    load_best_model_at_end=True
)

trainer = Trainer(
    model=qa_bert,
    args=training_args,
    train_dataset=qa_dataset['train'],
    eval_dataset=qa_dataset['test'],
    data_collator=data_collator
)

# Get initial metrics
trainer.evaluate()

{'eval_loss': 5.853602409362793,
 'eval_runtime': 23.5485,
 'eval_samples_per_second': 67.945,
 'eval_steps_per_second': 2.123}
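One way to read this initial loss: the loss of `BertForQuestionAnswering` is the mean of the cross-entropies for the start and end positions, and a head that guesses uniformly over N positions has cross-entropy ln(N). The back-of-envelope sketch below converts the loss into an equivalent number of positions. It assumes padded sequence lengths of a few hundred tokens, which we did not measure.

```python
import math

initial_eval_loss = 5.853602409362793  # from trainer.evaluate() above

# A uniform guess over N positions has cross-entropy ln(N), so N = exp(loss).
equivalent_positions = math.exp(initial_eval_loss)

# Conversely, exp(-loss) is the geometric-mean probability put on the correct position.
mean_correct_prob = math.exp(-initial_eval_loss)

print(round(equivalent_positions), round(mean_correct_prob, 4))
```

An untrained head being roughly as good as a uniform guess over a few hundred positions is consistent with the warning above that some weights were newly initialized.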

In [None]:
trainer.train()   # This is a long process



|Epoch|Training Loss|Validation Loss|
|--|--|--|
|1|4.2052|4.080186|
|2|3.8932|3.922663|
|3|3.7087|3.911666|


TrainOutput(global_step=600, training_loss=4.025537643432617, metrics={'train_runtime': 433.7003, 'train_samples_per_second': 44.27, 'train_steps_per_second': 1.383, 'total_flos': 1.2556757019627648e+16, 'train_loss': 4.025537643432617, 'epoch': 3.0})
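The step count and final loss can be sanity-checked with a little arithmetic. The sketch below assumes a single device (so the per-device batch size is the full batch size) and reads exp(-loss) as the geometric-mean probability on the correct start or end position.

```python
import math

samples, test_size = 8000, 0.2
batch_size, epochs = 32, 3

test_examples = round(samples * test_size)        # 1600
train_examples = samples - test_examples          # 6400, as in the Map progress output
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

final_val_loss = 3.911666  # validation loss after epoch 3
prob_on_correct_position = math.exp(-final_val_loss)

print(total_steps, round(prob_on_correct_position, 3))
```

The 600 steps match `global_step=600` in the output above, and a probability of about 0.02 on the correct position fits the low scores seen in the evaluation that follows.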

#### Save the trained model for future use

In [None]:
trainer.save_model()

#### Evaluation:

In [None]:
pipe = pipeline("question-answering", './qa_result_8K_3epoch', tokenizer=bert_tokenizer)

#### Simple Trivia Question: Where is the capital of Canada?

In [None]:
context="The capital of Canada is Ottawa."
question="Where is the capital of Canada?"

pipe(question=question, context=context)

{'score': 0.3064378798007965, 'start': 25, 'end': 31, 'answer': 'Ottawa'}

#### Simple Information Seeking Question: Where is Sean living these days?

In [None]:
pipe("Where is Sean living these days?", "These days Sean lives in Toronto but his sisters had moved to Brampton.")

{'score': 0.16663514077663422, 'start': 25, 'end': 32, 'answer': 'Toronto'}

#### Seeking information from a long text

In [None]:
# https://en.wikipedia.org/wiki/George_Brown_College

long_text1 = """
In September 2012, George Brown opened the Waterfront Campus located at 51 Dockside Dr., south of Queen's Quay between Jarvis and Parliament Streets 
(between Corus Quay and Redpath Sugar Refinery). This campus is home to the Centre for Health Sciences. In 2019, the college expanded its Waterfront Campus 
to the Daniels Waterfront - City of the Arts complex at 3 Lower Jarvis St. - home to the School of Design. And the latest Waterfront Campus expansion, Limberlost Place, 
is set to open at 185 Queens Quay E. in 2025. The 10-storey tall-wood, mass-timber building will be the first institutional building of its kind in Ontario and will house 
the School of Architectural Studies and the School of Computer Technology. It will also house a research institute and a child care centre.
"""
question = "Where is Waterfront Campus located?"

pipe(question=question, context=long_text1)

{'score': 0.004953663796186447,
 'start': 459,
 'end': 475,
 'answer': 'Limberlost Place'}

#### Seeking information not explicitly available

In [None]:
# wikipedia https://en.wikipedia.org/wiki/Toronto

long_text_just_for_fun = """
During the American Revolutionary War, an influx of British settlers arrived there as United Empire Loyalists fled for the British-controlled
lands north of Lake Ontario. The Crown granted them land to compensate for their losses in the Thirteen Colonies. The new province of
Upper Canada was being created and needed a capital. In 1787, the British Lord Dorchester arranged for the Toronto Purchase with the
Mississaugas of the New Credit First Nation, thereby securing more than a quarter of a million acres (1000 km2) of land in the Toronto area.
Dorchester intended the location to be named Toronto. The first 25 years after the Toronto purchase were quiet, although
"there were occasional independent fur traders" present in the area, with the usual complaints of debauchery and drunkenness.
"""
trick_question = "When was the new province of Upper Canada created?"

pipe(question=trick_question, context=long_text_just_for_fun)

{'score': 0.004785825498402119, 'start': 176, 'end': 181, 'answer': 'Crown'}

### Comparing Results of Models from Hugging Face
* These models pick out the answers to the trivial questions with confidence.
* They also pick out an erroneous answer to the bogus question, often with a high score.

|Hugging Face Model|Q1 score|Q2 score|Q3 score|Bogus Question|
|--|--|--|--|--|
|DistilBERT|0.968|0.977|0.909|0.929|
|roberta-base-squad2|0.878|0.958|0.231|0.481|
|BERT large uncased|0.601|0.987|0.358|0.734|
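A natural idea is to reject low-scoring answers with a threshold. The sketch below checks, using the scores from the table above, whether any single threshold per model would keep all three correct answers while rejecting the wrong answer to the bogus question; `separable` is a hypothetical helper.

```python
# Scores copied from the table above: (Q1, Q2, Q3) are correct answers,
# the last value is the wrong answer "1787" to the bogus question.
scores = {
    "DistilBERT": ((0.968, 0.977, 0.909), 0.929),
    "roberta-base-squad2": ((0.878, 0.958, 0.231), 0.481),
    "BERT large uncased": ((0.601, 0.987, 0.358), 0.734),
}


def separable(correct_scores, bogus_score):
    """True if some threshold keeps every correct answer and rejects the bogus one."""
    return min(correct_scores) > bogus_score


for model, (correct, bogus) in scores.items():
    print(model, separable(correct, bogus))
```

For all three models the bogus answer scores higher than at least one correct answer, so on these four questions a score threshold alone cannot filter it out.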



### Comparing Results of Fine-Tuned Model
* Our fine-tuned model can pick out the right answer to the short questions, albeit with low scores.
* In the 8,000-sample run recorded above, it returned the wrong span ("Limberlost Place", score 0.005) for the long-text question; this is shown as X in the table.

|Samples|Epochs|Train (sec)|Train Loss|Val Loss|Q1 score|Q2 score|Q3 score|
|-----|--|--|--|--|--|--|--|
|4,000|2|150|4.12|4.21|0.187|0.101|0.002|
|4,000|3|222|3.13|4.32|0.239|0.223|0.004|
|4,000|4|306|3.68|4.08|0.192|0.189|0.004|
|8,000|3|433|3.71|3.91|0.306|0.167|X|

### Summary and Conclusion
* In this notebook, we evaluated three question-answering models from Hugging Face.  We also fine-tuned the pre-trained "bert-large-uncased" model, retraining the last two encoder layers and the question-answering head on a semi-custom dataset.
* The models from Hugging Face performed exceptionally well on the trivial questions and are quite capable of finding information in a long text.  However, these models also picked out erroneous information, in some cases with very high confidence.  All three returned "1787" as the answer to the bogus question, even though the answer to that question cannot be found in the text.
* Our fine-tuned model picked out the correct answers to the two trivial questions, but with low scores, and in the 8,000-sample run recorded here it returned the wrong span for the long-text question.  The low scores could be due to our small dataset or short training time.

### References
* Models and data were found on the Hugging Face site.
* The fine-tuned model follows the procedure outlined in the O'Reilly video series "Intro to Transformer Models for NLP" by Sinan Ozdemir.
* Hugging Face tutorials on YouTube
* Wikipedia for the long-text passages