# QA Baseline without RAG

This notebook establishes a QA baseline without retrieval-augmented generation (RAG). For a fair comparison, it uses the same backbone model as the RAG pipeline, `meta-llama/Llama-3.1-8B-Instruct`, along with the same data type (fp16), the same tokenizer configuration, and the same prompt format.

In [1]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

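model_name = "meta-llama/Llama-3.1-8B-Instruct"
# Load the weights in fp16 and let device_map="auto" place them on the available devices
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Reuse the EOS token for padding and pad on the left, as is usual for decoder-only generation
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"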

generation_pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer, 
    torch_dtype=torch.float16
)


In [3]:
# Load the annotated QA test set
import random

import pandas as pd

qa_df = pd.read_csv("../../data/annotated/generated_qa_pairs_3000_test20.csv")

questions = qa_df["Question"].tolist()
answers = qa_df["Answer"].tolist()

# Randomly sample 10 QA pairs, with a fixed seed for reproducibility
sample_size = 10
random.seed(221)
sample_indices = random.sample(range(len(questions)), sample_size)
sample_questions = [questions[i] for i in sample_indices]
sample_answers = [answers[i] for i in sample_indices]

In [7]:
template = """
You are an expert assistant answering factual questions about various aspects of Pittsburgh or Carnegie Mellon University (CMU), including history, policy, culture, events, and more. 
If you do not know the answer, just say "I don't know."

Important Instructions:
- Answer concisely without repeating the question.
- Do **not** use complete sentences. Provide only the word, name, date, or phrase that directly answers the question. For example, given the question "When was Carnegie Mellon University founded?", you should only answer "1900".

Examples:
Question: Who is Pittsburgh named after? 
Answer: William Pitt
Question: What famous machine learning venue had its first conference in Pittsburgh in 1980? 
Answer: ICML
Question: What musical artist is performing at PPG Arena on October 13? 
Answer: Billie Eilish

Question: {question} \n\n
Answer:
"""

In [10]:
# Use the template to generate the answers
generated_answers = []
for question in sample_questions:
    full_prompt = template.format(question=question)
    messages = [
        {"role": "user", "content": full_prompt},
    ]
    output = generation_pipe(messages, max_new_tokens=50)
    # With chat-style input, "generated_text" holds the whole conversation;
    # the last message is the assistant's reply.
    generated_answers.append(output[0]["generated_text"][-1]["content"])

In [11]:
print(generated_answers)
print(sample_answers)

['Opera Guild', 'WYO', 'Family Friendly Opera', '1946', 'Robin Williams', "I don't know", 'David L. Lawrence Convention Center', "I don't know.", 'Pennsylvania', '1, 138,000']
['Monteverdi Society', 'Pittsburgh Jazz, Blues, and Bluegrass', 'La Traviata', '2024', 'Adrian Cronauer', 'New Jersey Devils', "The city's convention center", '4.5 seconds', 'Pennsylvania', '1,846,000']
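
The two lists above are compared by eye only. One way to make the comparison quantitative is sketched below, assuming SQuAD-style exact match and token-level F1 as the metrics. The notebook does not say which metric the RAG pipeline is evaluated with, so the metric choice and the helper names `normalize`, `exact_match`, and `token_f1` are illustrative assumptions. The two lists are copied from the printed output above.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Copied from the printed output above (assumed to be aligned pairwise).
generated_answers = ['Opera Guild', 'WYO', 'Family Friendly Opera', '1946',
                     'Robin Williams', "I don't know",
                     'David L. Lawrence Convention Center', "I don't know.",
                     'Pennsylvania', '1, 138,000']
sample_answers = ['Monteverdi Society', 'Pittsburgh Jazz, Blues, and Bluegrass',
                  'La Traviata', '2024', 'Adrian Cronauer', 'New Jersey Devils',
                  "The city's convention center", '4.5 seconds',
                  'Pennsylvania', '1,846,000']

pairs = list(zip(generated_answers, sample_answers))
em = sum(exact_match(p, r) for p, r in pairs) / len(pairs)
f1 = sum(token_f1(p, r) for p, r in pairs) / len(pairs)
print(f"Exact match: {em:.2f}, token F1: {f1:.2f}")
```

Token-level F1 gives partial credit to answers such as 'David L. Lawrence Convention Center', which exact match scores as a miss. With only 10 sampled pairs, either number is a rough indication rather than a reliable estimate.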
