# **Question Answering**

Extractive question answering is the task of extracting the answer to a given question from a text. In question answering, text summarization methods are used to find answers to user questions in documents [[1]](#scrollTo=5aLTXh5Sa1bC).
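
To make "extracting an answer from a text" more concrete: extractive models typically assign each token a score for being the start of the answer and a score for being the end, and the answer is the span with the best combined score. The following is a minimal sketch of that selection step only. The function name `best_span`, the `max_len` parameter, the tokens, and all score values are made up for illustration; this is not the post-processing code of any particular library.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 15) -> Tuple[int, int]:
    """Return inclusive token indices (s, e) that maximize
    start_scores[s] + end_scores[e], subject to s <= e < s + max_len."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start position.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Hypothetical tokens and scores for the question "What is SQuAD?"
tokens = ["SQuAD", "is", "a", "question", "answering", "dataset", "."]
start_scores = [0.1, 0.0, 1.2, 2.5, 0.3, 0.2, -1.0]
end_scores = [0.0, -0.5, 0.1, 0.4, 1.0, 3.0, -1.0]

s, e = best_span(start_scores, end_scores)
answer = " ".join(tokens[s:e + 1])
print(answer)
```

The constraint `s <= e` matters: without it, the highest start score and the highest end score could be combined even when the end lies before the start, which does not describe a valid span.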

This notebook shows an example of extractive question answering with the SQuAD dataset.

## **Question answering with the SQuAD dataset**

The Stanford Question Answering Dataset (SQuAD) is a question answering dataset consisting of 100,000+ questions about a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage [[??]](https://nlp.stanford.edu/pubs/rajpurkar2016squad.pdf). The dataset is freely available at [[??]](https://stanford-qa.com).




In this section, we give a simple introduction to question answering with the ``transformers`` library, based on an example from the Hugging Face documentation [[2]](https://huggingface.co/transformers/task_summary.html#extractive-question-answering).

The SQuAD dataset is an example of a question answering dataset that is entirely based on this task.

### Install ``transformers``

In [1]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.20.1-py3-none-any.whl (4.4 MB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.8.1-py3-none-any.whl (101 kB)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.12.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Installing collected packages: pyyaml, tokenizers, huggingface-hub, transformers
  Attempting uninstall: pyyaml
    Found existing installation: PyYAML 3.13
    Uninstall

### Import ``pipeline``

To use a model on a given text right away, we use the ``pipeline`` API. A pipeline groups a pre-trained model together with the preprocessing that was used during that model's training. Many NLP tasks have a pre-trained pipeline ready to use [[3]](https://pypi.org/project/transformers/).

In [2]:
# Import pipeline from transformers library
from transformers import pipeline

### Create a pipeline

In [3]:
# Create a 'question-answering' pipeline
nlp = pipeline("question-answering")

No model was supplied, defaulted to distilbert-base-cased-distilled-squad (https://huggingface.co/distilbert-base-cased-distilled-squad)
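
The message above shows that, when no model is given, the library picks a default. To avoid depending on that default, the model can be named explicitly. The sketch below assumes the model identifier printed in the message; the helper name `build_qa_pipeline` and the constant `MODEL_NAME` are introduced here for illustration only. The import is done inside the function so that defining the helper does not by itself load the library or download anything.

```python
# Model identifier taken from the log message above (an assumption that it
# remains available under this name on the Hugging Face Hub).
MODEL_NAME = "distilbert-base-cased-distilled-squad"


def build_qa_pipeline(model_name: str = MODEL_NAME):
    """Create a question-answering pipeline for an explicitly named model."""
    # Imported lazily; calling this function downloads the model on first use.
    from transformers import pipeline

    return pipeline("question-answering", model=model_name, tokenizer=model_name)
```

Calling `nlp = build_qa_pipeline()` would then be expected to behave like the cell above, without the "No model was supplied" message.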



### Create a sample text

In [4]:
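# Create some text to use as the context.
# Note: the leading "... " on each line is part of the string literal,
# so the character offsets reported by the model include these characters.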
context = r"""
... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
... a model on a SQuAD task, you may leverage the examples/question-answering/run_squad.py script.
... """

### Apply question answering model on the given text

In [5]:
# Print the answers for the following questions:
## "What is extractive question answering?"
## "What is a good example of a question answering dataset?"

result = nlp(question="What is extractive question answering?", context=context)
print(f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}")

result = nlp(question="What is a good example of a question answering dataset?", context=context)
print(f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}")

Answer: 'the task of extracting an answer from a text given a question', score: 0.564, start: 38, end: 99
Answer: 'SQuAD dataset', score: 0.4472, start: 155, end: 168
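
The `start` and `end` values in the output can be read as character offsets into `context`, with `end` exclusive, so slicing the context recovers the answer text. The following sketch checks this against the two results printed above. It repeats the context string exactly as defined in the notebook (including the leading `... ` on each line, which counts toward the offsets); the `results` list is just the printed values copied by hand.

```python
# Same context string as in the notebook cell; the "... " prefixes are part
# of the string and therefore shift all character offsets.
context = r"""
... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
... a model on a SQuAD task, you may leverage the examples/question-answering/run_squad.py script.
... """

# Values copied from the printed results above.
results = [
    {"answer": "the task of extracting an answer from a text given a question",
     "start": 38, "end": 99},
    {"answer": "SQuAD dataset", "start": 155, "end": 168},
]

for r in results:
    # Slice the context with the reported offsets (end is exclusive).
    span = context[r["start"]:r["end"]]
    print(repr(span), span == r["answer"])
```

This also explains why the offsets differ from what one would get for the same text without the `... ` prefixes: each prefix adds four characters before the sentence it precedes.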


# **References**

- [1] NLP and Computer Vision_DLMAINLPCV01 Course Book
- [2] https://huggingface.co/transformers/task_summary.html#extractive-question-answering
- [3] https://pypi.org/project/transformers/

Copyright © 2022 IU International University of Applied Sciences