# Step 3: Build Basic Text-Based Tutor
This notebook builds a simple question-answering system using SQuAD and a pre-trained model.
- Date: July 21, 2025
- This step uses the SQuAD dataset from Step 2 to answer questions from text.

In [3]:
from transformers import pipeline
import pickle

# Load SQuAD dataset from file
with open('squad_dataset.pkl', 'rb') as file:
    dataset = pickle.load(file)

print("Dataset loaded from squad_dataset.pkl!")

# Create the question-answering tool
text_qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

# Test with SQuAD example
context = dataset['train'][1]['context']
question = dataset['train'][1]['question']
result = text_qa(question=question, context=context)
print(f"Question: {question}")
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['score']:.2f}")

Device set to use cpu


Dataset loaded from squad_dataset.pkl!
Question: What is in front of the Notre Dame Main Building?
Answer: a copper statue of Christ
Confidence: 0.36
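
One way to check an answer like the one above, rather than judging it by eye, is to compare it with the reference answers stored alongside each SQuAD example. The sketch below is loosely modeled on the exact-match idea used for SQuAD evaluation (lowercase, strip punctuation and articles, collapse whitespace). The helper names `normalize` and `exact_match` are hypothetical, and the field layout `dataset['train'][1]['answers']['text']` is an assumption about how Step 2 saved the data.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """Return True if the prediction matches any reference after normalization."""
    return any(normalize(prediction) == normalize(ref) for ref in references)


# Hypothetical usage with the objects from the cell above
# (assumes each example carries an 'answers' dict with a 'text' list):
# references = dataset['train'][1]['answers']['text']
# print(exact_match(result['answer'], references))
```

Exact match is strict: a partially correct span counts as a miss, so it works best as a lower bound on accuracy.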


In [5]:
# Custom test
context = "The capital of France is Paris."
question = "What is the capital of France?"
result = text_qa(question=question, context=context)
print(f"Question: {question}")
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['score']:.2f}")

Question: What is the capital of France?
Answer: Paris
Confidence: 0.99


In [6]:
context = "Photosynthesis uses sunlight to produce energy in plants."
question = "What does photosynthesis do?"
result = text_qa(question=question, context=context)
print(f"Question: {question}")
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['score']:.2f}")

Question: What does photosynthesis do?
Answer: produce energy in plants
Confidence: 0.17
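
The three cells above repeat the same call-and-print pattern. A minimal sketch of a helper that wraps it is shown here; the name `ask` is hypothetical, and it takes the QA callable as an argument so it can be used with `text_qa` or with any function that returns a dict containing `answer` and `score`.

```python
def ask(qa, question, context):
    """Run one question through a QA callable, print the result, and return it."""
    result = qa(question=question, context=context)
    print(f"Question: {question}")
    print(f"Answer: {result['answer']}")
    print(f"Confidence: {result['score']:.2f}")
    return result


# Hypothetical usage with the pipeline created earlier:
# ask(text_qa, "What is the capital of France?", "The capital of France is Paris.")
```

Returning the result dict as well as printing it keeps the helper usable in later steps that need the raw score.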


## Observations
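- DistilBERT returned a correct or plausible answer span for the SQuAD example (loaded via pickle) and for both custom inputs.
- Confidence varied widely: 0.99 for the direct capital-of-France question, but only 0.36 for the SQuAD example and 0.17 for the open-ended photosynthesis question. A high score signals a clear question with an unambiguous span; a low score does not by itself mean the answer is wrong.
- The SQuAD example comes from the training split, and this model was fine-tuned on SQuAD, so it is not an independent test of accuracy.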
- This is the text-based core of the AI Tutor, ready to add voice and image features.
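
Since the scores in the outputs above range from 0.17 to 0.99, a tutor built on this model may want to decide when an answer is confident enough to show. The following is a minimal sketch of that idea, not part of the notebook's code: the function name `answer_or_fallback`, the fallback message, and the 0.3 threshold are all assumptions chosen for illustration.

```python
def answer_or_fallback(result, min_score=0.3):
    """Return the model's answer if its score clears the threshold, else a fallback.

    `result` is a dict with 'answer' and 'score' keys, as returned by the
    question-answering pipeline. The default threshold is an arbitrary
    illustrative value, not a tuned one.
    """
    if result["score"] >= min_score:
        return result["answer"]
    return "I'm not sure. Could you rephrase or give more context?"


# Hypothetical usage with a pipeline result:
# print(answer_or_fallback(text_qa(question=question, context=context)))
```

With this threshold the photosynthesis answer (score 0.17) would be withheld even though it reads as plausible, which shows the trade-off: a higher threshold hides more wrong answers but also more acceptable ones.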