<div class="alert alert-success"><h1>Question Answering with Pretrained Models in Python</h1></div>

**Question answering (QA)** is a core application of natural language processing (NLP) that enables machines to answer questions directly from a given text passage. With deep learning and pretrained models, we can build powerful QA systems that either extract the answer from the text (**extractive QA**) or generate a novel response (**abstractive QA**). In this tutorial, we focus on extractive question answering using a pretrained Hugging Face model.

## Learning Objectives
By the end of this tutorial, you will be able to:
+ **Implement a question answering pipeline:** Build and run an extractive question answering pipeline that retrieves answer spans directly from the provided context.
+ **Analyze the QA output:** Understand the output, including score metrics and answer spans.


## Prerequisites
Before we begin, please ensure that you have:
+ A working knowledge of Python, including variables, functions, loops, and basic object-oriented programming.
+ Familiarity with deep learning model development in Python using Keras and TensorFlow.
+ A Python (version 3.x) environment with the `tensorflow`, `keras`, `ipywidgets`, and `transformers` packages installed.

Let's also reduce the log verbosity of the `transformers` package so that only error messages are reported, while warnings and informational logs are suppressed.

In [1]:
from transformers import logging
logging.set_verbosity_error()

<hr>

## 1. Instantiate a Pipeline for Question Answering
The first thing we do is import the `pipeline` function from the Hugging Face `transformers` package. Then we instantiate a pipeline object called `answerer` while specifying `"question-answering"` as the task.

In [2]:
from transformers import pipeline
answerer = pipeline(task="question-answering")
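
Because no model is named in the call above, the pipeline falls back to a default checkpoint chosen by the library, and that default may differ between releases. A minimal sketch of pinning the checkpoint explicitly is shown below. The model name `distilbert-base-cased-distilled-squad` is an assumption (a SQuAD-tuned checkpoint hosted on the Hugging Face Hub), and `build_answerer` is a hypothetical helper introduced for illustration, not part of `transformers`.

```python
# Hypothetical helper: pin the QA checkpoint instead of relying on the default.
# MODEL_NAME is an assumption; substitute any extractive QA checkpoint you trust.
MODEL_NAME = "distilbert-base-cased-distilled-squad"


def build_answerer(model_name: str = MODEL_NAME):
    """Return a question-answering pipeline backed by a named checkpoint."""
    # Imported lazily so that defining the helper does not load the library.
    from transformers import pipeline

    return pipeline(task="question-answering", model=model_name)
```

Calling `build_answerer()` would download the checkpoint on first use and return an object that can be used in place of `answerer`.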

## 2. Answer Questions Based on Provided Text
Next, we provide a source document from which questions should be answered. 

In [3]:
context_text = """
The Apollo missions were a series of space missions conducted by NASA (National Aeronautics 
and Space Administration) between 1961 and 1972 with the primary objective of landing humans 
on the Moon and safely returning them to Earth. These missions marked a significant milestone in 
space exploration, pushing the boundaries of human capability beyond Earth’s atmosphere.

The most famous mission, Apollo 11, successfully landed astronauts Neil Armstrong and Buzz Aldrin 
on the Moon on July 20, 1969, while Michael Collins piloted the command module in lunar orbit. This 
historic event marked the first time humans set foot on another celestial body, with Armstrong’s 
famous words: "That's one small step for man, one giant leap for mankind."

Beyond the technological and scientific advancements, the Apollo missions provided valuable data 
about the Moon’s surface, geological composition, and atmosphere. The missions also tested life 
support systems, spacecraft engineering, and astronaut endurance in deep space. These insights 
have been critical in shaping future space exploration, including potential human missions to Mars 
and beyond.

The success of the Apollo program demonstrated the feasibility of human space travel and laid the 
groundwork for subsequent missions like the Space Shuttle program, the International Space 
Station (ISS), and modern lunar exploration projects like Artemis. Moreover, the knowledge gained 
from these missions continues to influence discussions on the potential colonization of other 
celestial bodies, advancing our understanding of the possibilities for long-term human habitation 
beyond Earth.
"""

Then we specify the question that we want answered. The model will identify the answer span within the source document that best answers this question.

In [4]:
question_text = "What were the primary objectives of NASA’s Apollo missions?"

Finally, we pass both the source text and the question to our pipeline.

In [5]:
answer = answerer(
    question=question_text,
    context=context_text
)

print(answer)

{'score': 0.789436399936676, 'start': 172, 'end': 234, 'answer': 'landing humans \non the Moon and safely returning them to Earth'}


The pipeline processes the provided context and question, then returns a dictionary with four keys: `'answer'` (the extracted text), `'score'` (the model’s confidence in that span, between 0 and 1), and `'start'` and `'end'` (the character offsets of the answer span within the context).
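
In the output above, `end - start` is 62, which matches the length of the answer string, so the offsets appear to behave like a Python slice with an exclusive end. The sketch below illustrates that relationship without loading a model. The `context` string, the `result` dictionary (including its score), and the `span_from_offsets` helper are all hypothetical stand-ins built by hand, not real pipeline output.

```python
# Hypothetical stand-ins for a context and a pipeline result, built by hand
# so the example runs without loading a model.
context = "Apollo 11 landed on the Moon on July 20, 1969."
answer_span = "July 20, 1969"
start = context.find(answer_span)
result = {
    "score": 0.9,  # made-up value for illustration only
    "start": start,
    "end": start + len(answer_span),  # exclusive end, like a Python slice
    "answer": answer_span,
}


def span_from_offsets(context: str, result: dict) -> str:
    """Recover the answer text from the character offsets."""
    return context[result["start"]:result["end"]]


recovered = span_from_offsets(context, result)
print(recovered)  # July 20, 1969
```

Slicing the original context this way is a quick check that an extracted answer really is a verbatim span of the source text.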

Let's reformat the output to make it easier to read.

In [6]:
print(f"Question: {question_text}")
# Collapse the embedded newline (and any repeated spaces) into single spaces
answer_text = " ".join(answer['answer'].split())
print(f"Answer: '{answer_text}'")
print(f"Score: {answer['score']}, Start: {answer['start']}, End: {answer['end']}")

Question: What were the primary objectives of NASA’s Apollo missions?
Answer: 'landing humans on the Moon and safely returning them to Earth'
Score: 0.789436399936676, Start: 172, End: 234


With the embedded line break removed, the answer reads as a single line, and the confidence score and character offsets are printed separately beneath it.
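
If the same formatting is needed for several questions, it can be wrapped in a small function. The sketch below is one way to do that; `format_answer` and its `digits` parameter are hypothetical names introduced here, and the example dictionary simply copies the values from the output shown earlier in this tutorial.

```python
def format_answer(question: str, result: dict, digits: int = 3) -> str:
    """Render one QA result dictionary as a short, readable report."""
    # Collapse newlines and repeated spaces inside the extracted span.
    answer_text = " ".join(result["answer"].split())
    return (
        f"Question: {question}\n"
        f"Answer: '{answer_text}'\n"
        f"Score: {result['score']:.{digits}f}, "
        f"Start: {result['start']}, End: {result['end']}"
    )


# Values copied from the output shown earlier in this tutorial.
example = {
    "score": 0.789436399936676,
    "start": 172,
    "end": 234,
    "answer": "landing humans \non the Moon and safely returning them to Earth",
}
report = format_answer(
    "What were the primary objectives of NASA’s Apollo missions?", example
)
print(report)
```

Rounding the score to a few decimal places is a presentation choice only; the full value remains available in the original dictionary.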

The extractive QA approach used here retrieves the answer from a specific span of the provided text, so every response is verifiable and fully supported by the original source. It is also limited: the model can only return text that appears verbatim in the context, so it cannot rephrase an answer or combine information from separate passages.

In contrast, an abstractive model generates responses in its own words, often producing more concise and natural-sounding answers. While this can improve readability, it also comes with drawbacks. An abstractive model may introduce information that is not explicitly stated in the original text, potentially leading to inaccuracies or hallucinated details.

The choice between extractive and abstractive question answering ultimately depends on the application's needs. Extractive QA is ideal for scenarios requiring precise, evidence-backed answers, ensuring factual accuracy. On the other hand, abstractive QA offers greater flexibility and fluency, though it may sometimes prioritize readability over strict factual precision.