In [1]:
# Question Answering — Extract Answers from Context (Transformers)
# Goal: Answer questions using a given passage (context).

# This is called Extractive Question Answering because the model extracts the answer from the provided text instead of generating new text.

# Tool: Hugging Face Transformers pipeline('question-answering')

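# Important: The model can only answer using text present in the context. With the default settings it still returns some span when the context does not contain the answer, so check the score before trusting a result.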

# Let's install transformers now

!pip -q install transformers
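To make "extracts the answer" concrete: an extractive QA model scores every token as a possible start and as a possible end of the answer, and the answer is the span with the best combined score. The sketch below is a simplified illustration of that selection step, not the pipeline's actual post-processing. The function name `best_span`, the `max_answer_len` default, and the toy scores are all assumptions made up for this example.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int = 15) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring valid token span."""
    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        # The end token may not precede the start token, and the span
        # is capped at max_answer_len tokens.
        for j in range(i, min(i + max_answer_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy scores, invented for illustration only (not real model output).
tokens = ["SFU", "is", "located", "in", "British", "Columbia"]
start_scores = [0.1, 0.0, 0.2, 0.3, 2.5, 0.4]
end_scores = [0.0, 0.1, 0.1, 0.2, 0.6, 3.0]

start, end, _ = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

Because the answer is always a slice of the input tokens, the model cannot produce words that are absent from the context.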


In [2]:
# Step 1 — Import and load the QA pipeline
# This loads a pretrained model suitable for question answering.

from transformers import pipeline
import pandas as pd

qa = pipeline("question-answering")


No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.



Device set to use cpu
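The warning above recommends naming the model and revision explicitly. A minimal sketch of that follows, using the model name and revision printed in the warning. The helper name `load_pinned_qa` is an assumption for this example, and calling it needs `transformers` installed plus network access on first use.

```python
MODEL_NAME = "distilbert/distilbert-base-cased-distilled-squad"
MODEL_REVISION = "564e9b5"  # revision reported in the pipeline warning


def load_pinned_qa():
    """Build the QA pipeline with an explicit model and revision."""
    # Imported here so the constants can be inspected without loading the library.
    from transformers import pipeline

    return pipeline("question-answering", model=MODEL_NAME, revision=MODEL_REVISION)
```

Using `qa = load_pinned_qa()` in place of the bare `pipeline("question-answering")` call should keep results reproducible if the library's default model changes later.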


In [3]:
# Step 2 — Provide a context passage
# Think of the context as a mini reading comprehension passage. The QA model will look inside this text and try to extract the answer.

context = """
Simon Fraser University (SFU) is located in British Columbia, Canada.
The university has multiple campuses, including Burnaby, Surrey, and Vancouver.
Many students commute to campus and attend lectures in-person or online.
In some courses, lectures can be recorded when accommodations allow it.
"""


In [4]:
# Step 3 — Ask questions
# We ask a few questions that should be answerable from the context.

questions = [
    "Where is Simon Fraser University located?",
    "Which campuses does SFU have?",
    "How do many students get to campus?",
    "Can lectures be recorded?"
]


In [5]:
# Step 4 — Run question answering and store results
# The QA pipeline returns a dict with:
# 1- answer: the extracted text span
# 2- score: the model's confidence in that span, between 0 and 1 (higher means more confident)
# 3- start / end: character positions of the span in the context

rows = []

for q in questions:
    result = qa(question=q, context=context)

    rows.append({
        "question": q,
        "answer": result["answer"],
        "score": result["score"]
    })

qa_df = pd.DataFrame(rows)
qa_df


   question                                   answer                                              score
0  Where is Simon Fraser University located?  British Columbia, Canada                            0.9295
1  Which campuses does SFU have?              Burnaby, Surrey, and Vancouver                      0.925888
2  How do many students get to campus?        commute to campus and attend lectures in-perso...  0.38983
3  Can lectures be recorded?                  when accommodations allow it                        0.269438
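Each pipeline result also carries `start` and `end` character offsets, and slicing the context with them should reproduce the `answer` string. The sketch below checks that relationship on a hand-built result. The names `span_matches`, `sample_context`, and `example_result` are assumptions for illustration, and the dict is a stand-in rather than real model output.

```python
def span_matches(context: str, result: dict) -> bool:
    """Check that result['start'] and result['end'] slice out result['answer']."""
    return context[result["start"]:result["end"]] == result["answer"]


sample_context = """
Simon Fraser University (SFU) is located in British Columbia, Canada.
"""

# Hand-built stand-in for a pipeline result; the offsets are computed
# with str.find rather than taken from a real model run.
answer = "British Columbia, Canada"
start = sample_context.find(answer)
example_result = {"answer": answer, "start": start, "end": start + len(answer)}

print(span_matches(sample_context, example_result))
```

With a real run, the same check would be `span_matches(context, qa(question=q, context=context))`.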

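The last two scores are much lower than the first two, which suggests those answers deserve a second look. One simple way to surface them is to flag every row below a cutoff. This is a sketch only: the 0.5 threshold and the name `flag_low_confidence` are assumptions, and a sensible cutoff depends on the model and the data. The scores are copied from the table above.

```python
from typing import Dict, List


def flag_low_confidence(rows: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Return copies of rows with a boolean 'low_confidence' field added."""
    return [{**row, "low_confidence": row["score"] < threshold} for row in rows]


# Questions and scores copied from the results table.
scored = [
    {"question": "Where is Simon Fraser University located?", "score": 0.9295},
    {"question": "Which campuses does SFU have?", "score": 0.925888},
    {"question": "How do many students get to campus?", "score": 0.38983},
    {"question": "Can lectures be recorded?", "score": 0.269438},
]

flagged = flag_low_confidence(scored)
for row in flagged:
    if row["low_confidence"]:
        print(f"Review: {row['question']} (score={row['score']:.2f})")
```

The same function accepts the `rows` list built in Step 4, since each entry there also has a `score` key.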