# Transformer-Based Question Answering
Question answering (QA) is one of the popular applications of natural language processing (NLP). In this work I use the Transformers library to build a QA system. Transformers is an open-source library developed by Hugging Face that provides a wide range of pre-trained models, such as BERT and DistilBERT, that can be used for QA.
BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained transformer-based neural network model for natural language processing developed by Google AI Language. DistilBERT is a smaller, lighter Transformer model trained by distilling BERT base.

## Import

In [47]:
from transformers import pipeline

## DistilBERT Model
Load a pre-trained DistilBERT model, fine-tuned on SQuAD, from the Hugging Face model hub.

In [48]:
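# Use the same checkpoint for the model and the tokenizer so the two stay consistent
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad", tokenizer="distilbert-base-cased-distilled-squad")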

## Context

In [49]:
context = """It was a dark and stormy night. The old mansion on the hill loomed ominously, with its broken windows and overgrown vines. Inside, the air was musty and cold, and the only light came from flickering candles. As I made my way through the dusty halls, I couldn't shake the feeling that I was being watched. Suddenly, I heard a creaking sound behind me. I turned around to see a ghostly figure staring back at me with hollow eyes.
"""

## Generate answers from questions
Use the `qa` pipeline to answer questions about the given context.

In [50]:
question = 'What was the weather like on the night of the story?'
question1 = "How was the ghost look like?"
question2 = "How was the feeling while walking through the dusty halls?"

answer = qa(question=question, context=context)['answer']
answer1 = qa(question=question1, context=context)['answer']
answer2 = qa(question=question2, context=context)['answer']

In [51]:
print('Context:\n',context)
print('\nQuestion 1:\n', question)
print('Answer 1:\n', answer)
print('\nQuestion 2:\n', question1)
print('Answer 2:\n', answer1)
print('\nQuestion 3:\n', question2)
print('Answer 3:\n', answer2)

Context:
 It was a dark and stormy night. The old mansion on the hill loomed ominously, with its broken windows and overgrown vines. Inside, the air was musty and cold, and the only light came from flickering candles. As I made my way through the dusty halls, I couldn't shake the feeling that I was being watched. Suddenly, I heard a creaking sound behind me. I turned around to see a ghostly figure staring back at me with hollow eyes.


Question 1:
 What was the weather like on the night of the story?
Answer 1:
 dark and stormy

Question 2:
 How was the ghost look like?
Answer 2:
 hollow eyes

Question 3:
 How was the feeling while walking through the dusty halls?
Answer 3:
 I couldn't shake the feeling that I was being watched
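
The cells above keep only the `'answer'` field, but the question-answering pipeline returns a dictionary that also carries a confidence `score` and the `start` and `end` character offsets of the answer within the context. The sketch below shows how those fields relate to each other. It does not call the model: the `result` dictionary is built by hand, its `score` is a made-up placeholder, and the helper `span_matches` is a hypothetical name introduced only for illustration.

```python
# A shortened context, enough to illustrate the offsets.
context = "It was a dark and stormy night. The old mansion on the hill loomed ominously."

# Hand-built stand-in for a pipeline result. The keys mirror what the
# question-answering pipeline returns; the score is a placeholder, not a
# value measured from any model.
answer_text = "dark and stormy"
start = context.find(answer_text)
result = {
    "score": 0.5,                      # placeholder confidence
    "start": start,                    # character offset where the answer begins
    "end": start + len(answer_text),   # character offset just past the answer
    "answer": answer_text,
}


def span_matches(context, result):
    """Return True if the character offsets point at the answer text."""
    return context[result["start"]:result["end"]] == result["answer"]


print(result["start"], result["end"])
print(span_matches(context, result))
```

Keeping the whole dictionary rather than just `'answer'` makes it possible to highlight the answer in the original text or to discard answers whose score falls below a threshold of your choosing.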


## Import

In [52]:
import torch

## BERT Model
Load the pre-trained BERT model, fine-tuned on SQuAD, and its tokenizer.

In [53]:
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad", return_dict=False)


Provide the context and question

In [54]:
context = """It was a dark and stormy night. The old mansion on the hill loomed ominously, with its broken windows and overgrown vines. Inside, the air was musty and cold, and the only light came from flickering candles. As I made my way through the dusty halls, I couldn't shake the feeling that I was being watched. Suddenly, I heard a creaking sound behind me. I turned around to see a ghostly figure staring back at me with hollow eyes.
"""
question = "How was the feeling while walking through the dusty halls?"

## Tokenization
Tokenize the question and context together as a single pair and encode them as PyTorch tensors.

In [55]:
# Calling the tokenizer directly encodes the (question, context) pair and adds special tokens by default
inputs = tokenizer(question, context, add_special_tokens=True, return_tensors="pt")
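
To make the pair encoding more concrete, here is a toy sketch of the layout BERT expects: `[CLS] question [SEP] context [SEP]`, with segment IDs of 0 for the question part and 1 for the context part. It is an illustration only. It splits on whitespace instead of using WordPiece, works with token strings rather than vocabulary IDs, and `encode_pair` is a hypothetical helper, not a function from the Transformers library.

```python
def encode_pair(question_tokens, context_tokens):
    """Lay out a question/context pair the way BERT-style models expect."""
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + context_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the question, and the first [SEP];
    # segment 1 covers the context and the final [SEP].
    token_type_ids = [0] * (len(question_tokens) + 2) + [1] * (len(context_tokens) + 1)
    # No padding in this toy example, so every position is attended to.
    attention_mask = [1] * len(tokens)
    return tokens, token_type_ids, attention_mask


# Whitespace splitting stands in for the real WordPiece tokenizer.
question_tokens = "how was the feeling ?".split()
context_tokens = "i was being watched .".split()

tokens, token_type_ids, attention_mask = encode_pair(question_tokens, context_tokens)
print(tokens)
print(token_type_ids)
print(attention_mask)
```

The real tokenizer returns the same three pieces of information as `input_ids`, `token_type_ids`, and `attention_mask`, with integer vocabulary IDs in place of the token strings.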

## Get Answer using Scores
Run the model to get a start score and an end score for every token, then take the highest-scoring start and end positions as the boundaries of the answer:

In [56]:
# Inference only, so gradient tracking is not needed
with torch.no_grad():
    start_scores, end_scores = model(**inputs)

In [57]:
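# Most likely start and end token positions
start_index = torch.argmax(start_scores)
end_index = torch.argmax(end_scores)

# Slice the token IDs inside the predicted span (the end index is inclusive)
answer_tokens = inputs["input_ids"][0][start_index:end_index+1]
# Map the IDs back to WordPiece tokens, then join them into a readable string
answer_tokens = tokenizer.convert_ids_to_tokens(answer_tokens, skip_special_tokens=True)
answer = tokenizer.convert_tokens_to_string(answer_tokens)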

In [58]:
print(answer)

i was being watched
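
Taking the two argmax positions independently works for this example, but nothing forces the best end position to come after the best start position, in which case the slice would be empty. One common alternative, sketched below, is to search for the pair with the highest combined score subject to `start <= end` and a maximum answer length. This is not what the notebook above does. The function name `best_span`, the `max_answer_len` default, and the toy scores are all assumptions made for illustration, and plain Python lists stand in for the model's score tensors.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return ((start, end), score) maximizing start + end score with start <= end."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start, within the length limit.
        last = min(s + max_answer_len, len(end_scores))
        for e in range(s, last):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best, best_score


# Toy scores chosen so that independent argmax fails:
# the best start is position 2, but the best end is position 1.
start_scores = [0.1, 0.2, 5.0, 0.3, 0.1]
end_scores = [0.1, 4.0, 0.2, 3.5, 0.1]

naive_start = start_scores.index(max(start_scores))
naive_end = end_scores.index(max(end_scores))
print("independent argmax:", (naive_start, naive_end))

span, score = best_span(start_scores, end_scores)
print("joint search:", span, score)
```

With real model outputs, the scores would first be converted to lists (for example with `start_scores[0].tolist()`), and the search would typically be restricted to context tokens so that the answer cannot fall inside the question.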
