# Assignment 1 - Question Answering

This notebook illustrates question answering over a single passage, using a mix of answerable and unanswerable questions.

In [2]:
import os
import openai
import json
from openai import OpenAI

In [3]:
openai.api_key = os.environ['OPENAI_API_KEY']

## Part 1: Add a New Question and Validate

In this section, we add a new question using the SQuAD v2.0 format and validate that the populated template is correct using a few rules.
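The validation rules are not spelled out above, so the following is only a minimal sketch of what such a check could look like. It assumes three rules: question ids are unique, unanswerable questions carry no `answers`, and every `answer_start` points at the exact position of the answer `text` in the context. The function name `validate_squad_paragraph` and the shortened `example` passage are hypothetical and used for illustration only.

```python
def validate_squad_paragraph(paragraph):
    """Return a list of human-readable problems; an empty list means valid.

    Assumed rules (illustrative, not taken from the assignment):
      1. question ids are unique
      2. impossible questions have an empty `answers` list
      3. context[answer_start:answer_start + len(text)] == text
    """
    problems = []
    context = paragraph.get("context", "")
    seen_ids = set()

    for qa in paragraph.get("qas", []):
        qid = qa.get("id")
        if qid in seen_ids:
            problems.append(f"duplicate id {qid!r}")
        seen_ids.add(qid)

        answers = qa.get("answers", [])
        if qa.get("is_impossible"):
            if answers:
                problems.append(f"id {qid!r}: impossible question has answers")
            continue

        if not answers:
            problems.append(f"id {qid!r}: answerable question has no answers")
        for ans in answers:
            text, start = ans["text"], ans["answer_start"]
            if context[start:start + len(text)] != text:
                found = context.find(text)  # -1 if the text is not in the context
                problems.append(
                    f"id {qid!r}: answer_start={start} does not match the "
                    f"context (str.find gives {found})"
                )
    return problems


# Hypothetical, shortened passage used only to exercise the checks.
example = {
    "context": "The Random Hills are a group of rugged hills.",
    "qas": [
        {"id": "1", "question": "What are the Random Hills?",
         "answers": [{"text": "a group of rugged hills", "answer_start": 21}],
         "is_impossible": False},
        {"id": "2", "question": "How tall are they?",
         "answers": [], "is_impossible": True},
    ],
}
print(validate_squad_paragraph(example))  # []
```

Running the same function on the full `data` dictionary before writing it to disk would flag any offset that does not line up with the context.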

In [5]:
data = {
    "context": """The Random Hills are a group of rugged hills located in Victoria Land, Antarctica. They are bounded on the west by Campbell Glacier and on the east by Tinker Glacier and Wood Bay. These hills are centered about 15 nautical miles (28 km; 17 mi) north-northwest of Mount Melbourne. The name \"Random Hills\" was given by the Southern Party of the New Zealand Geological Survey Antarctic Expedition (NZGSAE) during 1966–67, due to the random orientation of the ridges that comprise the feature.""",
    "qas": [
        {
            "question": "What are the Random Hills?",
            "id": "1",
            "answers": [{"text": "a group of rugged hills located in Victoria Land, Antarctica", "answer_start": 21}],
            "is_impossible": False
        },
        {
            "question": "Which glaciers bound the Random Hills to the west and east?",
            "id": "2",
            "answers": [{"text": "Campbell Glacier and on the east by Tinker Glacier and Wood Bay", "answer_start": 115}],
            "is_impossible": False
        },
        {
            "question": "How far are the Random Hills from Mount Melbourne?",
            "id": "3",
            "answers": [{"text": "15 nautical miles (28 km; 17 mi) north-northwest of Mount Melbourne", "answer_start": 211}],
            "is_impossible": False
        },
        {
            "question": "Who named the Random Hills?",
            "id": "4",
            "answers": [{"text": "Southern Party of the New Zealand Geological Survey Antarctic Expedition (NZGSAE)", "answer_start": 321}],
            "is_impossible": False
        },
        {
            "question": "Why were the Random Hills given their name?",
            "id": "5",
            "answers": [{"text": "due to the random orientation of the ridges that comprise the feature", "answer_start": 419}],
            "is_impossible": False
        },
        {
            "plausible_answers": [{"text": "part of the McMurdo Volcanic Group", "answer_start": 0}],
            "question": "What is the geological composition of the Random Hills?",
            "id": "6",
            "answers": [],
            "is_impossible": True
        },
        {
            "plausible_answers": [{"text": "during the Miocene epoch", "answer_start": 0}],
            "question": "When were the Random Hills formed?",
            "id": "7",
            "answers": [],
            "is_impossible": True
        },
        {
            "plausible_answers": [{"text": "1,410 meters (4,630 feet)", "answer_start": 0}],
            "question": "What is the height of the tallest hill in the Random Hills?",
            "id": "8",
            "answers": [],
            "is_impossible": True
        },
        {
            "plausible_answers": [{"text": "penguins and seals", "answer_start": 0}],
            "question": "Which animals are commonly found in the Random Hills region?",
            "id": "9",
            "answers": [],
            "is_impossible": True
        },
        {
            "plausible_answers": [{"text": "geological mapping and volcanic studies", "answer_start": 0}],
            "question": "What role do the Random Hills play in Antarctic research?",
            "id": "10",
            "answers": [],
            "is_impossible": True
        }
    ]
}

# Write the data to a JSON file
with open('squad_questions.json', 'w') as f:
    json.dump(data, f, indent=4)

## Part 2: Prompt GPT to Answer Questions

In this section, we prompt the model with each answerable and unanswerable question in the data structure and store its replies in `responses`.
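As a small illustration of the prompting step, the sketch below builds the per-question prompt as a plain function and tidies the model's reply before it is stored. The helper names `build_prompt` and `normalize_reply` and the `UNANSWERABLE` constant are hypothetical. Treating "Unanswerable." or "unanswerable" as the same sentinel is an assumption about how strict the comparison should be, not something the assignment prescribes.

```python
UNANSWERABLE = "Unanswerable"  # sentinel the prompt asks the model to return


def build_prompt(context, question):
    """Assemble the user message line by line, with no source-code indentation."""
    lines = [
        f"Context: {context}",
        f"Question: {question}",
        "Provide only the exact answer to the question using the text from the context.",
        f'If the question cannot be answered based on the context, reply with '
        f'"{UNANSWERABLE}" without any punctuation or extra words.',
    ]
    return "\n".join(lines)


def normalize_reply(reply):
    """Strip whitespace and map punctuation/case variants to the sentinel.

    Assumption: 'Unanswerable.' and 'unanswerable' should count as the same
    reply. Any other text (including an empty reply) is returned stripped.
    """
    cleaned = reply.strip()
    if cleaned.strip(" .!\"'").lower() == UNANSWERABLE.lower():
        return UNANSWERABLE
    return cleaned


prompt = build_prompt("The Random Hills are a group of rugged hills.",
                      "What are the Random Hills?")
print(prompt.splitlines()[1])            # Question: What are the Random Hills?
print(normalize_reply(" unanswerable. "))  # Unanswerable
```

Keeping the prompt construction separate from the API call makes it possible to inspect or unit-test the exact text sent to the model without spending tokens.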

In [7]:
# Initialize the OpenAI client
client = OpenAI()

# Load your SQuAD questions
with open('squad_questions.json', 'r') as f:
    squad_data = json.load(f)

# GPT Model Configuration
MODEL = "gpt-3.5-turbo"  # or gpt-4
responses = []

# Iterate over each question
for qa in squad_data['qas']:
    # Prepare the prompt
    prompt = f"""
    Context: {squad_data['context']}
    Question: {qa['question']}
    Provide only the exact answer to the question using the text from the context. 
    If the question cannot be answered based on the context, reply with "Unanswerable" without any punctuation or extra words.
    """

    # Call the Chat Completions API via the openai>=1.0 client interface
    try:
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ]
        )
        # Extract the model's response
        answer = response.choices[0].message.content.strip()
        print(f"Q: {qa['question']}\nA: {answer}\n")

        # Append to responses
        responses.append({
            "question": qa['question'],
            "model_answer": answer
        })

    except Exception as e:
        print(f"Error processing question: {qa['question']}. Error: {e}")
        responses.append({
            "question": qa['question'],
            "model_answer": "Error"
        })

# Save responses to JSON
with open('responsesgpt3.5-turbo-2.json', 'w') as f:
    json.dump(responses, f, indent=4)

print("Responses saved to responsesgpt3.5-turbo-2.json")

Q: What are the Random Hills?
A: The Random Hills are a group of rugged hills located in Victoria Land, Antarctica, bounded by Campbell Glacier, Tinker Glacier, and Wood Bay.

Q: Which glaciers bound the Random Hills to the west and east?
A: Campbell Glacier bounds the Random Hills to the west, and Tinker Glacier and Wood Bay bound them to the east.

Q: How far are the Random Hills from Mount Melbourne?
A: 17 miles

Q: Who named the Random Hills?
A: The Southern Party of the New Zealand Geological Survey Antarctic Expedition (NZGSAE).

Q: Why were the Random Hills given their name?
A: The name "Random Hills" was given by the Southern Party of the New Zealand Geological Survey Antarctic Expedition (NZGSAE) during 1966–67, due to the random orientation of the ridges that comprise the feature.

Q: What is the geological composition of the Random Hills?
A: Unanswerable

Q: When were the Random Hills formed?
A: Unanswerable

Q: What is the height of the tallest hill in the Random Hills?
A: 

## Part 3: Exact Match (EM) and F1-Score Evaluation of Responses

In this section, we evaluate the responses using Exact Match (EM) and F1-Score, comparing each predicted answer to the expected answer.
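To make the two metrics concrete, here is a small worked sketch of token-level F1, computed as the harmonic mean of precision and recall over whitespace tokens. The optional `normalize` flag is an addition for illustration: it lowercases, strips punctuation, and drops the articles "a", "an", and "the", in the spirit of the normalization used by the official SQuAD evaluation script. The names `normalize_answer` and `token_f1` are hypothetical, and the scoring in this notebook does not apply any normalization.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, remove punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def token_f1(predicted, expected, normalize=False):
    """Token-level F1 = 2 * P * R / (P + R) over whitespace-separated tokens."""
    if normalize:
        predicted = normalize_answer(predicted)
        expected = normalize_answer(expected)
    pred_tokens, exp_tokens = predicted.split(), expected.split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(exp_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(exp_tokens)
    return 2 * precision * recall / (precision + recall)


expected = ("Southern Party of the New Zealand Geological Survey "
            "Antarctic Expedition (NZGSAE)")
predicted = "The " + expected

print(round(token_f1(predicted, expected), 2))         # 0.96
print(token_f1(predicted, expected, normalize=True))   # 1.0
```

Without normalization, the leading "The" costs one false positive (11 shared tokens out of 12 predicted, so F1 = 22/23). With normalization the two strings become identical, which shows how sensitive strict string comparison is to articles and punctuation.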

In [9]:
# Load the expected answers and predictions
with open('squad_questions.json', 'r') as f:
    squad_data = json.load(f)

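# NOTE: this filename differs from the one written in Part 2
# ('responsesgpt3.5-turbo-2.json'); point it at the run you want to score.
with open('responsesgpt-3.5-turbo-3.json', 'r') as f: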
    predictions = json.load(f)

# Define EM and F1 scoring functions
from collections import Counter

def exact_match_score(predicted, expected):
    """Calculate the Exact Match (EM) score."""
    return int(predicted.strip() == expected.strip())

def f1_score(predicted, expected):
    """Calculate the F1 score."""
    predicted_tokens = predicted.split()
    expected_tokens = expected.split()
    predicted_counts = Counter(predicted_tokens)
    expected_counts = Counter(expected_tokens)

    common = predicted_counts & expected_counts
    tp = sum(common.values())  # True positives
    fp = len(predicted_tokens) - tp  # False positives
    fn = len(expected_tokens) - tp  # False negatives

    if tp == 0:
        return 0.0
    return (2 * tp) / (2 * tp + fp + fn)

# Initialize scores
total_em = 0
total_f1 = 0
count = 0

# Calculate scores for each question
for qa, prediction in zip(squad_data['qas'], predictions):
    expected = qa['answers'][0]['text'] if qa['answers'] else "Unanswerable"
    predicted = prediction['model_answer']

    em = exact_match_score(predicted, expected)
    f1 = f1_score(predicted, expected)

    total_em += em
    total_f1 += f1
    count += 1

    print(f"Question: {qa['question']}")
    print(f"Expected: {expected}")
    print(f"Predicted: {predicted}")
    print(f"Exact Match: {em}")
    print(f"F1 Score: {f1:.2f}\n")

# Overall scores
overall_em = total_em / count
overall_f1 = total_f1 / count

print(f"Overall Exact Match Score: {overall_em:.2f}")
print(f"Overall F1 Score: {overall_f1:.2f}")

Question: What are the Random Hills?
Expected: a group of rugged hills located in Victoria Land, Antarctica
Predicted: a group of rugged hills located in Victoria Land, Antarctica
Exact Match: 1
F1 Score: 1.00

Question: Which glaciers bound the Random Hills to the west and east?
Expected: Campbell Glacier and on the east by Tinker Glacier and Wood Bay
Predicted: Campbell Glacier and Tinker Glacier and Wood Bay.
Exact Match: 0
F1 Score: 0.70

Question: How far are the Random Hills from Mount Melbourne?
Expected: 15 nautical miles (28 km; 17 mi) north-northwest of Mount Melbourne
Predicted: about 15 nautical miles
Exact Match: 0
F1 Score: 0.40

Question: Who named the Random Hills?
Expected: Southern Party of the New Zealand Geological Survey Antarctic Expedition (NZGSAE)
Predicted: The Southern Party of the New Zealand Geological Survey Antarctic Expedition (NZGSAE)
Exact Match: 0
F1 Score: 0.96

Question: Why were the Random Hills given their name?
Expected: due to the random orientat