# Session 3: Prompt Engineering Lab
## Zero-Shot, Few-Shot, and Chain of Thought

**Objective:** Show in practice the difference between asking a model to "guess" with no examples (zero-shot) and teaching it with examples in the prompt (few-shot), then fix logic errors with Chain of Thought (CoT) prompting.

**Dataset:** Simulated University Support Tickets.

In [10]:
import ollama
import pandas as pd

# Small local model used for the experiments; swap in the alternative below to compare.
MODEL = "qwen3:0.6b"
# MODEL = "llama3.2:latest"

# The Data
tickets = [
    {"id": 101, "text": "I cannot access Moodle to submit my assignment.", "true_label": "IT"},
    {"id": 102, "text": "When is the deadline to pay the second tuition installment?", "true_label": "Finance"},
    {"id": 103, "text": "The projector in Room 304 has a broken bulb.", "true_label": "Facilities"},
    {"id": 104, "text": "I need an official transcript for my scholarship application.", "true_label": "Registrar"},
    {"id": 105, "text": "My wifi login credentials aren't working in the library.", "true_label": "IT"}
]

df = pd.DataFrame(tickets)

### Experiment 1: Zero-Shot Classification
We ask the model to classify without providing examples of *how* to do it.

In [11]:
def classify_zero_shot(text):
    prompt = f"""
    Classify the following support ticket into one category: [IT, Finance, Facilities, Registrar].
    Reply ONLY with the category name.

    Ticket: "{text}"
    Category:

    write only the ticket category and nothing else
    """
    response = ollama.generate(model=MODEL, prompt=prompt, options={'temperature': 0.0})
    return response['response'].strip()

# Run Experiment
df['zero_shot_pred'] = df['text'].apply(classify_zero_shot)

# Visualization: color a row green only when the prediction equals the label.
# Use == rather than `in`: a substring test would accept "" or "I" for "IT".
def color_correct(row):
    color = 'background-color: lightgreen' if row['zero_shot_pred'] == row['true_label'] else 'background-color: lightcoral'
    return [color] * len(row)

df[['text', 'true_label', 'zero_shot_pred']].style.apply(color_correct, axis=1)

| | text | true_label | zero_shot_pred |
|---|---|---|---|
| 0 | I cannot access Moodle to submit my assignment. | IT | IT |
| 1 | When is the deadline to pay the second tuition installment? | Finance | Finance |
| 2 | The projector in Room 304 has a broken bulb. | Facilities | IT |
| 3 | I need an official transcript for my scholarship application. | Registrar | Registrar |
| 4 | My wifi login credentials aren't working in the library. | IT | IT |
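
To put a number on the zero-shot result instead of reading it off the row colors, the predictions can be normalized and scored. The sketch below is not part of the original lab: `normalize_label` and `accuracy` are hypothetical helper names, and the `<think>...</think>` handling assumes the model may emit its reasoning inline, which depends on the model and the Ollama version in use.

```python
import re

CATEGORIES = ["IT", "Finance", "Facilities", "Registrar"]

def normalize_label(raw):
    """Map raw model output to one known category, or 'Unknown'."""
    # Assumption: some models wrap reasoning in <think>...</think>; drop it.
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()
    # Compare case-insensitively, ignoring surrounding punctuation.
    token = cleaned.strip(" .\"'[]").lower()
    for category in CATEGORIES:
        if token == category.lower():
            return category
    return "Unknown"

def accuracy(predictions, labels):
    """Fraction of predictions that exactly equal their label."""
    if not labels:
        return 0.0
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# Labels and predictions copied from the zero-shot results above.
labels = ["IT", "Finance", "Facilities", "Registrar", "IT"]
zero_shot = ["IT", "Finance", "IT", "Registrar", "IT"]
print(accuracy([normalize_label(p) for p in zero_shot], labels))  # 0.8
```

Anything that is not exactly one of the four categories is scored as `Unknown`, so a verbose answer counts as wrong rather than being matched by substring.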


### Experiment 2: Few-Shot Classification
We provide three examples (shots) in the prompt, one each for IT, Finance, and Facilities, to show the model the expected input/output pattern. Note that Registrar has no example.

In [5]:
def classify_few_shot(text):
    prompt = f"""
    Classify the ticket into: [IT, Finance, Facilities, Registrar].

    Example 1:
    Ticket: "The internet is down in the lab."
    Category: IT

    Example 2:
    Ticket: "How do I request a refund?"
    Category: Finance

    Example 3:
    Ticket: "The desk in hall B is broken."
    Category: Facilities

    Task:
    Ticket: "{text}"
    Category:

    write only the ticket category and nothing else
    """
    response = ollama.generate(model=MODEL, prompt=prompt, options={'temperature': 0.0})
    return response['response'].strip()

# Run Experiment
df['few_shot_pred'] = df['text'].apply(classify_few_shot)

# Visualization: color a row green only when the prediction equals the label.
# Use == rather than `in`: a substring test would accept "" or "I" for "IT".
def color_correct(row):
    color = 'background-color: lightgreen' if row['few_shot_pred'] == row['true_label'] else 'background-color: lightcoral'
    return [color] * len(row)

df[['text', 'true_label', 'few_shot_pred']].style.apply(color_correct, axis=1)

| | text | true_label | few_shot_pred |
|---|---|---|---|
| 0 | I cannot access Moodle to submit my assignment. | IT | IT |
| 1 | When is the deadline to pay the second tuition installment? | Finance | Finance |
| 2 | The projector in Room 304 has a broken bulb. | Facilities | Facilities |
| 3 | I need an official transcript for my scholarship application. | Registrar | Registrar |
| 4 | My wifi login credentials aren't working in the library. | IT | IT |
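
The three shots hard-coded in `classify_few_shot` can also be assembled from a list, which makes it easier to add or swap examples. This is a minimal sketch under stated assumptions: `EXAMPLES` and `build_few_shot_prompt` are hypothetical names, and ending the prompt at `Category:` (so the model completes the pattern) is a design choice of this sketch rather than what the lab's prompt does.

```python
# The same three shots used in the few-shot prompt of this experiment.
EXAMPLES = [
    ("The internet is down in the lab.", "IT"),
    ("How do I request a refund?", "Finance"),
    ("The desk in hall B is broken.", "Facilities"),
]

def build_few_shot_prompt(text, examples=EXAMPLES):
    """Assemble a few-shot classification prompt from (ticket, category) pairs."""
    lines = [
        "Classify the ticket into: [IT, Finance, Facilities, Registrar].",
        "Reply ONLY with the category name.",
        "",
    ]
    # One numbered block per example, mirroring the hand-written prompt.
    for i, (ticket, category) in enumerate(examples, start=1):
        lines += [f"Example {i}:", f'Ticket: "{ticket}"', f"Category: {category}", ""]
    # Leave the final category blank for the model to fill in.
    lines += ["Task:", f'Ticket: "{text}"', "Category:"]
    return "\n".join(lines)

prompt = build_few_shot_prompt("The projector in Room 304 has a broken bulb.")
print(prompt)
```

The resulting string can be passed as the `prompt` argument to `ollama.generate` in the same way as the hand-written prompt.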


### Experiment 3: Chain of Thought (CoT)
This experiment demonstrates how asking for explicit reasoning steps can improve a model's handling of math and logic problems.

**Problem:** Calculating FTE (Full-Time Equivalent) overload.

In [9]:
math_prompt = """
Prof X teaches 3 courses (4 credits each).
Prof Y teaches 2 courses (5 credits each).
A standard load is 10 credits.
Who has the higher overload and by how much?
"""

print("--- Standard Prompt ---")
res_std = ollama.generate(model=MODEL, prompt=math_prompt, options={'temperature': 0.1})
print(res_std['response'])

print("\n--- Chain of Thought Prompt (Let's think step by step) ---")
cot_prompt = math_prompt + "\nLet's think step by step. Show your math."
res_cot = ollama.generate(model=MODEL, prompt=cot_prompt, options={'temperature': 0.1})
print(res_cot['response'])

--- Standard Prompt ---
To calculate the overload, we need to find out how many credits each professor is taking above their standard load.

Prof X: Standard load = 3 x 4 = 12 credits
Overload = 16 - 12 = 4 credits (since they are taking 16 credits)

Prof Y: Standard load = 2 x 5 = 10 credits
Overload = 15 - 10 = 5 credits

Since Prof X has an overload of 4 credits and Prof Y has an overload of 5 credits, Prof Y has the higher overload by 1 credit.

--- Chain of Thought Prompt (Let's think step by step) ---
To find out who has the higher overload, we need to calculate the total number of credits for each professor.

Prof X teaches 3 courses with 4 credits each:
Total credits = 3 x 4 = 12 credits

Since a standard load is 10 credits, Prof X's overload is:
Overload = Total credits - Standard load
= 12 - 10
= 2 credits

Prof Y teaches 2 courses with 5 credits each:
Total credits = 2 x 5 = 10 credits

Again, since a standard load is 10 credits, Prof Y has no overload.

Since Prof X has an 

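**Conclusion:**
In this run, the standard prompt response invents totals (16 and 15 credits) that appear nowhere in the problem and concludes that Prof Y has the higher overload. The **CoT** response works through `3 * 4 = 12` and `2 * 5 = 10` explicitly and finds an overload of 2 credits for Prof X and none for Prof Y, which is the correct answer. Spelling out intermediate steps like this tends to reduce the arithmetic and hallucination errors common in smaller models, though it does not guarantee a correct result.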
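
The expected answer to the overload problem can be checked with a few lines of arithmetic. A minimal sketch, assuming "overload" means credits taught beyond the standard load and never drops below zero; the `overload` helper is a hypothetical name introduced here.

```python
STANDARD_LOAD = 10  # credits, as stated in the problem

def overload(courses, credits_per_course, standard_load=STANDARD_LOAD):
    """Credits taught above the standard load (0 if at or below it)."""
    total = courses * credits_per_course
    return max(0, total - standard_load)

prof_x = overload(3, 4)  # 3 * 4 = 12 credits taught
prof_y = overload(2, 5)  # 2 * 5 = 10 credits taught

print(f"Prof X overload: {prof_x} credits")
print(f"Prof Y overload: {prof_y} credits")
print(f"Difference: {prof_x - prof_y} credits")
```

Having a ground-truth value like this makes it possible to compare the standard and CoT responses against the same reference.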