# Part 3: Prompt Engineering Basics

## Introduction

In this part, you'll experiment with different prompting techniques to improve the quality of responses from Large Language Models (LLMs). You'll compare zero-shot, one-shot, and few-shot prompting approaches and document which works best for different types of questions.

## Learning Objectives

- Understand different prompting techniques
- Compare zero-shot, one-shot, and few-shot prompting
- Analyze the impact of prompt design on response quality

## Setup

In [31]:
# Import necessary libraries
import requests
import json

## 1. Understanding Prompting Techniques

The three techniques compared here differ in how many worked examples the prompt includes before the actual question:

1. **Zero-shot prompting**: Asking the model a question directly without examples
2. **One-shot prompting**: Providing one example before asking your question
3. **Few-shot prompting**: Providing multiple examples before asking your question

## 2. Creating Prompting Templates

Your first task is to create templates for different prompting strategies.

In [1]:
import requests
import json

# Define a question to experiment with
question = "What foods should be avoided by patients with gout?"

# Examples for few-shot prompting
examples = [
    ("What are the symptoms of gout?",
     "Gout symptoms include sudden severe pain, swelling, redness, and tenderness in joints, often the big toe."),
    ("How is gout diagnosed?",
     "Gout is diagnosed through physical examination, medical history, blood tests for uric acid levels, and joint fluid analysis to detect urate crystals.")
]

# Example for one-shot prompting: reuse the first few-shot pair
example_q, example_a = examples[0]

# Create prompting templates
zero_shot_template = "Question: {question}\nAnswer:"
one_shot_template = """Question: {example_q}
Answer: {example_a}

Question: {question}
Answer:"""
few_shot_template = """Question: {ex1_q}
Answer: {ex1_a}

Question: {ex2_q}
Answer: {ex2_a}

Question: {question}
Answer:"""

zero_shot_prompt = zero_shot_template.format(question=question)
one_shot_prompt = one_shot_template.format(example_q=example_q, example_a=example_a, question=question)
few_shot_prompt = few_shot_template.format(
    ex1_q=examples[0][0], ex1_a=examples[0][1],
    ex2_q=examples[1][0], ex2_a=examples[1][1],
    question=question
)
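
The three templates above differ only in how many worked examples precede the question, so a single helper can generate all of them. The sketch below assumes a hypothetical function name, `build_prompt`, which is not part of the original notebook, and reuses the `Question:`/`Answer:` layout of the templates. Passing zero, one, or several example pairs yields the zero-shot, one-shot, or few-shot prompt respectively.

```python
def build_prompt(question, examples=()):
    """Build a Question/Answer prompt preceded by zero or more worked examples.

    `build_prompt` is a hypothetical helper, not part of the original notebook.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {question}\nAnswer:")
    # A blank line separates blocks, matching the one-shot and few-shot templates.
    return "\n\n".join(blocks)


demo_examples = [
    ("What are the symptoms of gout?",
     "Gout symptoms include sudden severe pain, swelling, redness, and tenderness in joints, often the big toe."),
    ("How is gout diagnosed?",
     "Gout is diagnosed through physical examination, medical history, blood tests for uric acid levels, and joint fluid analysis to detect urate crystals."),
]
demo_question = "What foods should be avoided by patients with gout?"

zero_shot = build_prompt(demo_question)                    # no examples
one_shot = build_prompt(demo_question, demo_examples[:1])  # one example
few_shot = build_prompt(demo_question, demo_examples)      # all examples
print(few_shot)
```

With this shape, adding a third or fourth example only means extending the list, with no new template string.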


## 3. Connecting to the LLM API

Next, implement a function to send prompts to an LLM API and get responses.

In [3]:
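import openai
import os

# Note: openai.ChatCompletion is the pre-1.0 interface of the openai package.
# It was removed in openai 1.0, so this cell needs openai<1.0 installed.
openai.api_key = os.getenv("OPENAI_API_KEY")

def query_chatgpt(prompt):
    """Send a single-turn prompt to the chat model and return the reply text."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # Or "gpt-4" if available to you
        messages=[
            {"role": "system", "content": "You are a helpful medical assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7,  # moderate randomness; lower values give more repeatable answers
        max_tokens=200    # upper bound on the length of the reply
    )
    return response.choices[0].message["content"].strip()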

print("Zero-shot response:")
print(query_chatgpt(zero_shot_prompt))

print("\nOne-shot response:")
print(query_chatgpt(one_shot_prompt))

print("\nFew-shot response:")
print(query_chatgpt(few_shot_prompt))


Zero-shot response:
Patients with gout should avoid foods high in purines, such as organ meats (liver, kidney), red meat, shellfish, and certain types of fish (sardines, anchovies). Additionally, sugary beverages and alcohol, especially beer and liquor, should be limited or avoided as they can trigger gout attacks. It is important for patients with gout to maintain a healthy diet low in purines to help manage their condition.

One-shot response:
Patients with gout should avoid foods high in purines, such as red meat, organ meats, shellfish, and certain types of fish like sardines and anchovies. They should also limit alcohol consumption, especially beer and spirits high in purines. It is important for patients with gout to maintain a healthy diet low in purine-rich foods to help manage their condition.

Few-shot response:
Patients with gout should avoid high-purine foods such as organ meats, red meat, seafood, and alcohol, as they can trigger gout attacks. It is also important to limit
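
`openai.ChatCompletion` exists only in pre-1.0 releases of the `openai` package. As a hedged sketch, the same request against the 1.x client interface could look like the following. The function name `query_chat` and the choice to pass the client in as an argument are assumptions made here for illustration and testability. The model name, system message, and sampling parameters are copied from the cell above.

```python
def query_chat(client, prompt, model="gpt-3.5-turbo", temperature=0.7, max_tokens=200):
    """Send one user prompt through an openai>=1.0 style client and return the reply text.

    `query_chat` is a hypothetical helper; `client` is expected to be an
    `openai.OpenAI()` instance or any object exposing the same interface.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a helpful medical assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content.strip()


# Usage (requires openai>=1.0 and OPENAI_API_KEY set in the environment):
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   print(query_chat(client, "Question: What are the symptoms of gout?\nAnswer:"))
```

Passing the client as a parameter also makes it easy to substitute a stub object when testing without network access.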

## 4. Comparing Prompting Strategies

Now, let's compare the different prompting strategies on a set of healthcare questions.

In [None]:
import os
import time

import openai
import pandas as pd

openai.api_key = os.getenv("OPENAI_API_KEY")

# Example Q&A for one-shot and few-shot
example_q1 = "What are the symptoms of gout?"
example_a1 = "Gout symptoms include sudden severe pain, swelling, redness, and tenderness in joints, often the big toe."
example_q2 = "How is gout diagnosed?"
example_a2 = "Gout is diagnosed through physical examination, blood tests for uric acid levels, and analysis of joint fluid."

questions = [
    "What foods should be avoided by patients with gout?",
    "What medications are commonly prescribed for gout?",
    "How can gout flares be prevented?",
    "Is gout related to diet?",
    "Can gout be cured permanently?"
]

def create_prompts(q):
    zero = f"Question: {q}\nAnswer:"
    one = f"Question: {example_q1}\nAnswer: {example_a1}\n\nQuestion: {q}\nAnswer:"
    few = f"""Question: {example_q1}
Answer: {example_a1}

Question: {example_q2}
Answer: {example_a2}

Question: {q}
Answer:"""
    return zero, one, few

def query_chatgpt(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful medical assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.5,
        max_tokens=250
    )
    return response.choices[0].message.content.strip()

results = []

for q in questions:
    z, o, f = create_prompts(q)
    z_resp = query_chatgpt(z)
    o_resp = query_chatgpt(o)
    f_resp = query_chatgpt(f)
    results.append({
        "Question": q,
        "Zero-shot": z_resp,
        "One-shot": o_resp,
        "Few-shot": f_resp
    })
    time.sleep(1)

df = pd.DataFrame(results)
df.to_csv("prompting_strategy_comparison.csv", index=False)
print(df)


                                            Question  \
0  What foods should be avoided by patients with ...   
1  What medications are commonly prescribed for g...   
2                  How can gout flares be prevented?   
3                           Is gout related to diet?   
4                     Can gout be cured permanently?   

                                           Zero-shot  \
0  Patients with gout should avoid foods high in ...   
1  Common medications prescribed for gout include...   
2  Gout flares can be prevented by making certain...   
3  Yes, gout is related to diet. Gout is a type o...   
4  Gout is a type of inflammatory arthritis that ...   

                                            One-shot  \
0  Patients with gout should avoid foods high in ...   
1  Commonly prescribed medications for gout inclu...   
2  To prevent gout flares, you can try to maintai...   
3  Yes, diet can play a role in gout. Foods high ...   
4  Gout cannot be cured permanently, but it ca
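
The loop above issues fifteen API calls (five questions, three strategies) with no error handling, so a single transient failure such as a rate limit or timeout aborts the whole comparison. A minimal sketch of a retry wrapper with exponential backoff follows. `with_retries` and its parameters are hypothetical names, not part of the original notebook. It catches the broad `Exception` class only because the exact exception types depend on the installed client version; narrow it in real use.

```python
import time


def with_retries(func, *args, max_attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on any exception with exponential backoff.

    Hypothetical helper. Waits base_delay, 2 * base_delay, ... between attempts
    and re-raises the last exception once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # assumption: broad catch; narrow to the client's error types
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay:.2f}s")
            sleep(delay)


# Demo with a stand-in function that fails twice before succeeding.
calls = {"n": 0}


def flaky_query(prompt):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated transient API error")
    return f"answer to: {prompt}"


print(with_retries(flaky_query, "How is gout diagnosed?", base_delay=0.01))
```

In the comparison loop, a call such as `with_retries(query_chatgpt, z)` could stand in for the direct `query_chatgpt(z)` call.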

## 5. Evaluating Responses

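Create a simple evaluation function that scores each response by keyword coverage: the fraction of expected keywords that appear in the response as case-insensitive substrings. A score of 1.0 means every expected keyword was found. The cell below scores a hard-coded set of responses rather than the live API output.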

In [7]:
import pandas as pd

data = [
    {
        "Question": "What foods should be avoided by patients with gout?",
        "Zero-shot": "Patients with gout should avoid foods high in purines such as red meat, organ meats, and certain seafood. Alcohol, especially beer, should also be limited.",
        "One-shot": "Avoid purine-rich foods like red meat, seafood, and organ meats. It's also important to limit alcohol intake, particularly beer.",
        "Few-shot": "Foods to avoid include those high in purines such as red meat, organ meats, and seafood. Beer and other alcoholic beverages should also be limited."
    },
    {
        "Question": "What medications are commonly prescribed for gout?",
        "Zero-shot": "Common medications for gout include NSAIDs, colchicine, and corticosteroids to manage pain and inflammation. Long-term treatments may include allopurinol or febuxostat.",
        "One-shot": "Doctors often prescribe NSAIDs, colchicine, corticosteroids for acute attacks, and allopurinol or febuxostat for long-term uric acid control.",
        "Few-shot": "Medications for gout include NSAIDs, colchicine, corticosteroids for acute treatment, and urate-lowering therapies such as allopurinol and febuxostat for chronic management."
    },
    {
        "Question": "How can gout flares be prevented?",
        "Zero-shot": "Gout flares can be prevented by maintaining a healthy diet, avoiding trigger foods, staying hydrated, and taking prescribed medications regularly.",
        "One-shot": "Prevention includes lifestyle changes like diet management, weight control, avoiding alcohol, and using medications such as allopurinol when prescribed.",
        "Few-shot": "To prevent flares, maintain a low-purine diet, stay hydrated, limit alcohol, exercise regularly, and adhere to medications like allopurinol."
    },
    {
        "Question": "Is gout related to diet?",
        "Zero-shot": "Yes, gout is often related to diet. High-purine foods like red meat and seafood can increase uric acid levels, leading to gout.",
        "One-shot": "Yes. Diet plays a significant role in gout, especially foods rich in purines such as organ meats, alcohol, and seafood.",
        "Few-shot": "Yes, diet is a contributing factor. Consuming high-purine foods like red meat, seafood, and alcohol can trigger gout attacks."
    },
    {
        "Question": "Can gout be cured permanently?",
        "Zero-shot": "Gout cannot be completely cured but it can be effectively managed with lifestyle changes and medication.",
        "One-shot": "While there is no permanent cure for gout, it can be well-controlled with long-term treatment and lifestyle modifications.",
        "Few-shot": "Gout is a chronic condition, but with proper treatment, medication, and lifestyle management, symptoms can be controlled long-term."
    }
]

df_results = pd.DataFrame(data)

# Define the keyword-based scoring function
def score_response(response, keywords):
    response = response.lower()
    found_keywords = 0
    for keyword in keywords:
        if keyword.lower() in response:
            found_keywords += 1
    return found_keywords / len(keywords) if keywords else 0

# Define expected keywords
expected_keywords = {
    "What foods should be avoided by patients with gout?":
        ["purine", "red meat", "seafood", "alcohol", "beer", "organ meats"],
    "What medications are commonly prescribed for gout?":
        ["nsaids", "colchicine", "allopurinol", "febuxostat", "probenecid", "corticosteroids"],
    "How can gout flares be prevented?":
        ["medication", "diet", "weight", "alcohol", "water", "exercise"],
    "Is gout related to diet?":
        ["yes", "purine", "food", "alcohol", "seafood", "meat"],
    "Can gout be cured permanently?":
        ["manage", "treatment", "lifestyle", "medication", "chronic"]
}

# Score the responses
def evaluate_strategies(df, keywords_dict):
    strategy_scores = {"Zero-shot": [], "One-shot": [], "Few-shot": []}

    for _, row in df.iterrows():
        q = row["Question"]
        keywords = keywords_dict.get(q, [])

        for strategy in strategy_scores.keys():
            score = score_response(row[strategy], keywords)
            strategy_scores[strategy].append(score)

    avg_scores = {strategy: sum(scores)/len(scores) for strategy, scores in strategy_scores.items()}
    return avg_scores, strategy_scores

# Evaluate and display
avg_scores, all_scores = evaluate_strategies(df_results, expected_keywords)
avg_scores


{'Zero-shot': 0.7200000000000001, 'One-shot': 0.78, 'Few-shot': 0.9}
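
To make the metric concrete, here is one score worked through step by step. The sketch repeats the logic of `score_response` from the cell above and adds a hypothetical helper, `matched_keywords` (not in the original notebook), that lists which keywords were found. Matching is by substring, so `medications` counts as a hit for `medication`.

```python
def score_response(response, keywords):
    """Fraction of keywords that occur as case-insensitive substrings of response."""
    response = response.lower()
    found = sum(1 for keyword in keywords if keyword.lower() in response)
    return found / len(keywords) if keywords else 0


def matched_keywords(response, keywords):
    """Hypothetical helper: the keywords that score_response would count as found."""
    response = response.lower()
    return [keyword for keyword in keywords if keyword.lower() in response]


# Expected keywords and zero-shot answer for "How can gout flares be prevented?"
keywords = ["medication", "diet", "weight", "alcohol", "water", "exercise"]
zero_shot_answer = (
    "Gout flares can be prevented by maintaining a healthy diet, avoiding trigger foods, "
    "staying hydrated, and taking prescribed medications regularly."
)

print(matched_keywords(zero_shot_answer, keywords))          # ['medication', 'diet']
print(round(score_response(zero_shot_answer, keywords), 2))  # 0.33
```

The answer mentions staying hydrated but earns no credit for `water`, which illustrates how keyword scoring can under-rate paraphrases.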

## 6. Saving Results

Save your results in a structured format for auto-grading.

In [8]:
import os

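# Absolute path on the author's machine; point this at your own repository's results/part_3 directory
output_dir = "/Users/hteshome/Desktop/7-transformers-haile-teshome/results/part_3"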
os.makedirs(output_dir, exist_ok=True)
output_file = os.path.join(output_dir, "prompting_results.txt")

lines = ["# Prompt Engineering Results\n"]

# Add raw responses
for index, row in df_results.iterrows():
    lines.append(f"## Question: {row['Question']}\n")
    for strategy in ["Zero-shot", "One-shot", "Few-shot"]:
        lines.append(f"### {strategy} response:\n{row[strategy]}\n")
    lines.append("--------------------------------------------------\n")

# Add scores section
lines.append("\n## Scores\n")
lines.append("```\n")
lines.append("question,zero_shot,one_shot,few_shot")

# Add per-question scores
for i, row in df_results.iterrows():
    q = row["Question"]
    short_q = q.lower().replace(" ", "_").replace("?", "").replace(",", "")
    scores = [score_response(row[strategy], expected_keywords[q]) for strategy in ["Zero-shot", "One-shot", "Few-shot"]]
    line = f"{short_q},{scores[0]:.2f},{scores[1]:.2f},{scores[2]:.2f}"
    lines.append(line)

# Add average and best
lines.append(f"\naverage,{avg_scores['Zero-shot']:.2f},{avg_scores['One-shot']:.2f},{avg_scores['Few-shot']:.2f}")
best_strategy = max(avg_scores, key=avg_scores.get).lower().replace("-", "_")
lines.append(f"best_method,{best_strategy}")
lines.append("```")

# Write to file
with open(output_file, "w") as f:
    f.write("\n".join(lines))

output_file


'/Users/hteshome/Desktop/7-transformers-haile-teshome/results/part_3/prompting_results.txt'

## Progress Checkpoints

1. **Prompting Templates**:
   - [ ] Create zero-shot template
   - [ ] Create one-shot template
   - [ ] Create few-shot template
   - [ ] Format templates with questions and examples

2. **LLM API Integration**:
   - [ ] Connect to an LLM API (the assignment names the Hugging Face API; the code above uses the OpenAI API)
   - [ ] Test with different prompts
   - [ ] Handle API errors

3. **Comparison and Evaluation**:
   - [ ] Compare strategies on multiple questions
   - [ ] Score responses based on keywords
   - [ ] Determine the best strategy

4. **Results and Documentation**:
   - [ ] Save results in the required format
   - [ ] Document your findings

## What to Submit

1. Your implementation in a Python script `utils/prompt_comparison.py` that:
   - Defines the prompting templates
   - Connects to an LLM API (the assignment names the Hugging Face API; the code above uses the OpenAI API)
   - Compares different prompting strategies
   - Scores and evaluates the responses

2. The results of your experiments in `results/part_3/prompting_results.txt` with the format shown above

The auto-grader will check:
1. That your results file contains the required sections
2. That your scoring logic correctly identifies keyword presence
3. That you've correctly calculated average scores
4. That you've identified the best performing method
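
Before submitting, it may help to re-read the scores block and confirm that the `average` and `best_method` rows agree with the per-question rows. The sketch below is a hypothetical self-check, not the actual auto-grader, whose logic is not shown here. It assumes the layout written by the saving cell: a fenced block under `## Scores` with a `question,zero_shot,one_shot,few_shot` header, one row per question, then `average` and `best_method` rows.

```python
FENCE = "`" * 3  # built at runtime so the literal fence never appears in this listing


def check_scores_block(text, tolerance=0.011):
    """Hypothetical self-check: recompute averages and best method from a results file.

    Returns (averages_ok, best_ok). The tolerance allows for scores having
    been rounded to two decimals before they were written.
    """
    block = text.split("## Scores", 1)[1].split(FENCE)[1]
    rows = [line.split(",") for line in block.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    methods = header[1:]

    per_question = [list(map(float, r[1:])) for r in body
                    if r[0] not in ("average", "best_method")]
    reported_avg = next(list(map(float, r[1:])) for r in body if r[0] == "average")
    reported_best = next(r[1] for r in body if r[0] == "best_method")

    # Column-wise mean of the per-question scores.
    computed_avg = [sum(col) / len(col) for col in zip(*per_question)]
    averages_ok = all(abs(c - r) <= tolerance for c, r in zip(computed_avg, reported_avg))
    best_ok = methods[reported_avg.index(max(reported_avg))] == reported_best
    return averages_ok, best_ok


# Small made-up sample in the assumed layout (two questions only).
sample = "\n".join([
    "# Prompt Engineering Results",
    "",
    "## Scores",
    "",
    FENCE,
    "",
    "question,zero_shot,one_shot,few_shot",
    "how_can_gout_flares_be_prevented,0.33,0.67,0.67",
    "can_gout_be_cured_permanently,0.60,0.40,1.00",
    "",
    "average,0.47,0.53,0.83",
    "best_method,few_shot",
    FENCE,
])
print(check_scores_block(sample))  # (True, True)
```

To check a real file, read it with `open(path).read()` and pass the resulting string to `check_scores_block`.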