# Day 21: Prompt Engineering Patterns - Part 1

In this notebook, we'll implement and experiment with zero-shot and few-shot prompting techniques. These are fundamental prompt engineering patterns that can significantly improve language model performance without requiring model fine-tuning.

## Overview

We'll cover:
1. Setting up the environment
2. Zero-shot prompting implementation
3. Few-shot prompting implementation
4. Comparing the effectiveness of different approaches

## 1. Setting Up the Environment

First, let's install and import the necessary libraries. We'll use OpenAI's API for this demonstration, but the concepts apply to any language model.

In [None]:
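# Install required packages
# The code below uses openai.ChatCompletion, which was removed in openai 1.0,
# so pin the library to a pre-1.0 release.
!pip install "openai<1.0" python-dotenv requests matplotlib pandas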

In [None]:
import os
import openai
import json
import pandas as pd
import matplotlib.pyplot as plt
import time
from dotenv import load_dotenv

# Load environment variables (API keys)
load_dotenv()

# Set up OpenAI API
openai.api_key = os.getenv("OPENAI_API_KEY")

# If you don't have an API key, you can use this function to simulate API calls
def simulate_llm_call(prompt, model="gpt-3.5-turbo", temperature=0.7):
    """Simulate a call to a language model API for demonstration purposes."""
    print("\n--- Prompt ---\n")
    print(prompt)
    print("\n--- [Simulated LLM Response] ---\n")

    prompt_lower = prompt.lower()
    # Match keywords only against the final input. The instructions and few-shot
    # examples also contain words such as "Animals" or "terrible", which would
    # otherwise decide the answer for every prompt.
    target = prompt_lower.rsplit("text:", 1)[-1]

    # Simulate different responses based on the prompt
    if "sentiment" in prompt_lower:
        if "amazing" in target or "love" in target:
            return "positive"
        elif "terrible" in target or "hate" in target:
            return "negative"
        else:
            return "neutral"
    elif "classify" in prompt_lower:
        if "animal" in target:
            return "animal"
        elif "technology" in target or "computer" in target:
            return "technology"
        else:
            return "other"
    else:
        return "This is a simulated response. Please provide your OpenAI API key for actual responses."

# Function to call OpenAI API
def call_llm(prompt, model="gpt-3.5-turbo", temperature=0.7):
    """Call the language model API with the given prompt."""
    try:
        if not openai.api_key or openai.api_key == "your-api-key-here":
            return simulate_llm_call(prompt, model, temperature)
        
        response = openai.ChatCompletion.create(
            model=model,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ],
            temperature=temperature
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error calling LLM API: {e}")
        return simulate_llm_call(prompt, model, temperature)
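
Because `call_llm` falls back to the simulator on any exception, a transient error such as a rate limit could silently mix simulated answers into real results. One way to reduce that risk is to retry the raw API request a few times before falling back. The sketch below is a generic helper, not part of the OpenAI library; `call_with_retries`, `max_attempts`, `base_delay`, and the `flaky` stand-in are names invented for illustration.

```python
import time


def call_with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn() and retry with exponential backoff if it raises.

    Hypothetical helper: fn is any zero-argument callable, such as the raw
    API request. The last error is re-raised after max_attempts failures.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, ... between attempts
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a stand-in function that fails twice and then succeeds
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "positive"


result = call_with_retries(flaky, base_delay=0.01)
print(result, "after", attempts["count"], "attempts")
```

Since `call_llm` catches exceptions itself, the retry would need to wrap the API request inside its `try` block rather than wrap `call_llm` as a whole.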

## 2. Zero-Shot Prompting

Zero-shot prompting involves asking the model to perform a task without providing any examples. The model relies entirely on its pre-training knowledge and the instructions in the prompt.

In [None]:
def zero_shot_prompt(task_description, input_data, output_format=""):
    """Create a zero-shot prompt with clear instructions."""
    prompt = f"{task_description}\n\n{input_data}"
    
    if output_format:
        prompt += f"\n\n{output_format}"
    
    return prompt
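
To see exactly what the model receives, it can help to print one assembled prompt before sending it. The snippet below repeats the helper so that it runs on its own; the review text is a made-up input.

```python
def zero_shot_prompt(task_description, input_data, output_format=""):
    """Same helper as above, repeated so this snippet is self-contained."""
    prompt = f"{task_description}\n\n{input_data}"
    if output_format:
        prompt += f"\n\n{output_format}"
    return prompt


demo_prompt = zero_shot_prompt(
    "Classify the sentiment of the following text as positive, negative, or neutral.",
    "Text: The battery died after two days.",  # made-up example input
    "Sentiment:",
)
print(demo_prompt)
```

The prompt ends with the bare cue `Sentiment:`, which tends to nudge the model toward answering with just the label instead of a full sentence.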

### 2.1 Zero-Shot Sentiment Analysis

Let's implement a zero-shot prompt for sentiment analysis.

In [None]:
# Example texts for sentiment analysis
sentiment_texts = [
    "I love this new restaurant! The food was amazing and the service was excellent.",
    "This movie was terrible. I hated every minute of it.",
    "The weather is okay today, nothing special.",
    "I'm so excited about my vacation next week!",
    "The new software update has some interesting features."
]

# Create and test zero-shot sentiment analysis prompts
for text in sentiment_texts:
    task_description = "Classify the sentiment of the following text as positive, negative, or neutral."
    input_data = f"Text: {text}"
    output_format = "Sentiment:"
    
    prompt = zero_shot_prompt(task_description, input_data, output_format)
    response = call_llm(prompt, temperature=0.1)  # Low temperature for consistent results
    
    print(f"Text: {text}")
    print(f"Sentiment: {response}")
    print("-" * 50)

### 2.2 Zero-Shot Text Classification

Now let's try a different task: classifying text into categories.

In [None]:
# Example texts for classification
classification_texts = [
    "The new iPhone 13 has an improved camera and longer battery life.",
    "Elephants are the largest land animals and can live up to 70 years.",
    "The Great Barrier Reef is the world's largest coral reef system.",
    "Python is a popular programming language known for its simplicity and readability.",
    "The Eiffel Tower was completed in 1889 and stands at 324 meters tall."
]

# Categories
categories = ["Technology", "Animals", "Nature", "History"]

# Create and test zero-shot classification prompts
for text in classification_texts:
    task_description = f"Classify the following text into one of these categories: {', '.join(categories)}."
    input_data = f"Text: {text}"
    output_format = "Category:"
    
    prompt = zero_shot_prompt(task_description, input_data, output_format)
    response = call_llm(prompt, temperature=0.1)
    
    print(f"Text: {text}")
    print(f"Category: {response}")
    print("-" * 50)
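
A model does not always reply with the bare category name; it may echo the `Category:` prefix or add punctuation. A minimal sketch of mapping a free-form reply back onto the allowed list follows. `match_category` is a hypothetical helper, and the sample replies are invented for illustration.

```python
def match_category(response, categories):
    """Return the first allowed category mentioned in the response, else None.

    Hypothetical helper; matching is a case-insensitive substring search.
    """
    text = response.lower()
    for category in categories:
        if category.lower() in text:
            return category
    return None


allowed = ["Technology", "Animals", "Nature", "History"]
print(match_category("Category: Technology.", allowed))  # Technology
print(match_category("animals", allowed))                # Animals
print(match_category("other", allowed))                  # None
```

If a reply mentions several categories, this version returns whichever comes first in the list, so it is only a rough safeguard.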

### 2.3 Zero-Shot Question Answering

Let's try zero-shot prompting for question answering tasks.

In [None]:
# Example context and questions
context = """
The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. 
The U.S. National Aeronautics and Space Administration (NASA) led development of the telescope in collaboration 
with the European Space Agency (ESA) and the Canadian Space Agency (CSA). The telescope is named after James E. Webb, 
who was the administrator of NASA from 1961 to 1968 during the Mercury, Gemini, and Apollo programs.

The telescope was launched on 25 December 2021 on an Ariane 5 rocket from Kourou, French Guiana, and arrived at 
the Sun–Earth L2 Lagrange point in January 2022. The first JWST image was released to the public on 11 July 2022.
"""

questions = [
    "When was the James Webb Space Telescope launched?",
    "Who is the telescope named after?",
    "Which space agencies collaborated on the JWST?",
    "Where is the JWST located?"
]

# Create and test zero-shot QA prompts
for question in questions:
    task_description = "Answer the question based on the given context."
    input_data = f"Context: {context}\n\nQuestion: {question}"
    output_format = "Answer:"
    
    prompt = zero_shot_prompt(task_description, input_data, output_format)
    response = call_llm(prompt)
    
    print(f"Question: {question}")
    print(f"Answer: {response}")
    print("-" * 50)

## 3. Few-Shot Prompting

Few-shot prompting provides the model with a few examples of the desired input-output pattern before presenting the actual task.

In [None]:
def few_shot_prompt(task_description, examples, input_data, output_format=""):
    """Create a few-shot prompt with examples."""
    prompt = f"{task_description}\n\n"
    
    # Add examples
    for example in examples:
        prompt += f"{example['input']}\n{example['output']}\n\n"
    
    # Add the actual input
    prompt += f"{input_data}"
    
    if output_format:
        prompt += f"\n{output_format}"
    
    return prompt
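
Printing one assembled few-shot prompt shows how the examples and the final input line up. The snippet repeats the helper so that it runs on its own; the example texts are made up.

```python
def few_shot_prompt(task_description, examples, input_data, output_format=""):
    """Same helper as above, repeated so this snippet is self-contained."""
    prompt = f"{task_description}\n\n"
    for example in examples:
        prompt += f"{example['input']}\n{example['output']}\n\n"
    prompt += f"{input_data}"
    if output_format:
        prompt += f"\n{output_format}"
    return prompt


# Made-up examples for illustration
demo_examples = [
    {"input": "Text: The soup was cold.", "output": "Sentiment: negative"},
    {"input": "Text: What a lovely surprise!", "output": "Sentiment: positive"},
]
demo_prompt = few_shot_prompt(
    "Classify the sentiment of the following texts as positive, negative, or neutral.",
    demo_examples,
    "Text: The train arrived on time.",
    "Sentiment:",
)
print(demo_prompt)
```

Each example is an input line followed by its labeled output, and the final input ends with an unfilled `Sentiment:` line in the same layout, so the model only has to continue the pattern.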

### 3.1 Few-Shot Sentiment Analysis

Let's implement few-shot prompting for sentiment analysis.

In [None]:
# Examples for sentiment analysis
sentiment_examples = [
    {
        "input": "Text: This movie was terrible. I hated every minute of it.",
        "output": "Sentiment: negative"
    },
    {
        "input": "Text: The weather is okay today, nothing special.",
        "output": "Sentiment: neutral"
    },
    {
        "input": "Text: I'm so excited about my vacation next week!",
        "output": "Sentiment: positive"
    }
]

# New texts for sentiment analysis
new_sentiment_texts = [
    "The customer service was absolutely appalling.",
    "I might go to the park later if it doesn't rain.",
    "The concert last night was the best I've ever been to!"
]

# Create and test few-shot sentiment analysis prompts
for text in new_sentiment_texts:
    task_description = "Classify the sentiment of the following texts as positive, negative, or neutral."
    input_data = f"Text: {text}"
    output_format = "Sentiment:"
    
    prompt = few_shot_prompt(task_description, sentiment_examples, input_data, output_format)
    response = call_llm(prompt, temperature=0.1)
    
    print(f"Text: {text}")
    print(f"Sentiment: {response}")
    print("-" * 50)

### 3.2 Few-Shot Text Classification

Now let's try few-shot prompting for text classification.

In [None]:
# Examples for classification
classification_examples = [
    {
        "input": "Text: The new iPhone 13 has an improved camera and longer battery life.",
        "output": "Category: Technology"
    },
    {
        "input": "Text: Elephants are the largest land animals and can live up to 70 years.",
        "output": "Category: Animals"
    },
    {
        "input": "Text: The Great Barrier Reef is the world's largest coral reef system.",
        "output": "Category: Nature"
    }
]

# New texts for classification
new_classification_texts = [
    "The Roman Empire reached its peak territorial expansion under Emperor Trajan.",
    "Quantum computers use qubits instead of traditional binary bits.",
    "Dolphins are highly intelligent marine mammals known for their playful behavior.",
    "The Amazon Rainforest produces about 20% of the world's oxygen."
]

# Create and test few-shot classification prompts
for text in new_classification_texts:
    task_description = "Classify the following texts into one of these categories: Technology, Animals, Nature, History."
    input_data = f"Text: {text}"
    output_format = "Category:"
    
    prompt = few_shot_prompt(task_description, classification_examples, input_data, output_format)
    response = call_llm(prompt, temperature=0.1)
    
    print(f"Text: {text}")
    print(f"Category: {response}")
    print("-" * 50)
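
The hand-written example list above covers Technology, Animals, and Nature but has no History example, even though History is one of the allowed categories. Building the example dicts from labeled pairs makes such gaps easier to spot and fill. This is a sketch; `make_examples` and `covered` are hypothetical names.

```python
def make_examples(pairs, input_label="Text", output_label="Category"):
    """Turn (text, label) pairs into the dicts that few_shot_prompt expects."""
    return [
        {"input": f"{input_label}: {text}", "output": f"{output_label}: {label}"}
        for text, label in pairs
    ]


labeled_pairs = [
    ("The new iPhone 13 has an improved camera and longer battery life.", "Technology"),
    ("Elephants are the largest land animals and can live up to 70 years.", "Animals"),
    ("The Great Barrier Reef is the world's largest coral reef system.", "Nature"),
]
examples = make_examples(labeled_pairs)

# Check which categories have at least one example
categories = ["Technology", "Animals", "Nature", "History"]
covered = {label for _, label in labeled_pairs}
print("Missing examples for:", [c for c in categories if c not in covered])
```

The same function works for the sentiment task by passing `output_label="Sentiment"`.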

### 3.3 Few-Shot Named Entity Recognition

Let's implement few-shot prompting for a more complex task: named entity recognition.

In [None]:
# Examples for named entity recognition
ner_examples = [
    {
        "input": "Text: Apple is looking at buying U.K. startup for $1 billion.",
        "output": "Entities: [\n  {\"entity\": \"Apple\", \"type\": \"ORG\"},\n  {\"entity\": \"U.K.\", \"type\": \"GPE\"},\n  {\"entity\": \"$1 billion\", \"type\": \"MONEY\"}\n]"
    },
    {
        "input": "Text: Microsoft was founded by Bill Gates and Paul Allen in April 1975.",
        "output": "Entities: [\n  {\"entity\": \"Microsoft\", \"type\": \"ORG\"},\n  {\"entity\": \"Bill Gates\", \"type\": \"PERSON\"},\n  {\"entity\": \"Paul Allen\", \"type\": \"PERSON\"},\n  {\"entity\": \"April 1975\", \"type\": \"DATE\"}\n]"
    }
]

# New texts for NER
new_ner_texts = [
    "Amazon announced a new office in Seattle that will employ 2,000 people.",
    "The Golden State Warriors won the NBA championship in June 2022.",
    "Tesla's CEO Elon Musk visited Berlin last Thursday."
]

# Create and test few-shot NER prompts
for text in new_ner_texts:
    task_description = "Extract named entities from the text and classify them as PERSON, ORG (organization), GPE (geopolitical entity), DATE, or MONEY."
    input_data = f"Text: {text}"
    output_format = "Entities:"
    
    prompt = few_shot_prompt(task_description, ner_examples, input_data, output_format)
    response = call_llm(prompt, temperature=0.1)  # Low temperature for consistent structured output
    
    print(f"Text: {text}")
    print(f"Entities: {response}")
    print("-" * 50)
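
The few-shot examples ask for entities as a JSON array, so the reply can be parsed into Python objects instead of being printed as raw text. A minimal sketch, assuming the reply contains a single JSON array that may be preceded by text such as `Entities:`; `parse_entities` is a hypothetical helper and the sample reply is made up.

```python
import json


def parse_entities(response):
    """Parse a model reply into a list of entity dicts, or None on failure.

    Hypothetical helper: assumes the reply contains one JSON array,
    optionally preceded by text such as "Entities:".
    """
    start = response.find("[")
    end = response.rfind("]")
    if start == -1 or end < start:
        return None
    try:
        parsed = json.loads(response[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, list) else None


# Made-up reply in the same format as the few-shot examples
sample = 'Entities: [\n  {"entity": "Amazon", "type": "ORG"},\n  {"entity": "Seattle", "type": "GPE"}\n]'
entities = parse_entities(sample)
print([(e["entity"], e["type"]) for e in entities])
print(parse_entities("I could not find any entities."))
```

Returning `None` instead of raising lets the caller decide whether to retry, skip the item, or log the malformed reply.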

## 4. Comparing Zero-Shot and Few-Shot Performance

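Let's compare the performance of zero-shot and few-shot prompting on a sentiment analysis task. The test set below has only six examples, so treat the resulting accuracies as an illustration of the workflow, not as a statistically meaningful comparison.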

In [None]:
# Test dataset with ground truth labels
test_data = [
    {"text": "This product exceeded all my expectations!", "sentiment": "positive"},
    {"text": "The service was slow and the staff was rude.", "sentiment": "negative"},
    {"text": "It was an average experience, nothing special.", "sentiment": "neutral"},
    {"text": "I wouldn't recommend this to anyone.", "sentiment": "negative"},
    {"text": "The price is reasonable for what you get.", "sentiment": "neutral"},
    {"text": "Best purchase I've made all year!", "sentiment": "positive"}
]

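# Function to evaluate accuracy
def evaluate_accuracy(predictions, ground_truth):
    """Return the fraction of predictions that match the ground-truth labels."""
    # Ignore case, surrounding whitespace, and a trailing period (e.g. "Positive.")
    correct = sum(
        1 for p, g in zip(predictions, ground_truth)
        if p.strip().rstrip(".").lower() == g.lower()
    )
    return correct / len(ground_truth) if ground_truth else 0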

# Run zero-shot and few-shot prompting on test data
zero_shot_results = []
few_shot_results = []
ground_truth = [item["sentiment"] for item in test_data]

for item in test_data:
    text = item["text"]
    
    # Zero-shot
    zero_shot_prompt_text = zero_shot_prompt(
        "Classify the sentiment of the following text as positive, negative, or neutral.",
        f"Text: {text}",
        "Sentiment:"
    )
    zero_shot_response = call_llm(zero_shot_prompt_text, temperature=0.1)
    zero_shot_results.append(zero_shot_response)
    
    # Few-shot
    few_shot_prompt_text = few_shot_prompt(
        "Classify the sentiment of the following texts as positive, negative, or neutral.",
        sentiment_examples,
        f"Text: {text}",
        "Sentiment:"
    )
    few_shot_response = call_llm(few_shot_prompt_text, temperature=0.1)
    few_shot_results.append(few_shot_response)
    
    # Add a small delay to avoid rate limits
    time.sleep(1)

# Calculate accuracy
zero_shot_accuracy = evaluate_accuracy(zero_shot_results, ground_truth)
few_shot_accuracy = evaluate_accuracy(few_shot_results, ground_truth)

print(f"Zero-shot accuracy: {zero_shot_accuracy:.2f}")
print(f"Few-shot accuracy: {few_shot_accuracy:.2f}")

# Create a comparison table
comparison_data = []
for i, item in enumerate(test_data):
    comparison_data.append({
        "Text": item["text"],
        "Ground Truth": item["sentiment"],
        "Zero-Shot": zero_shot_results[i],
        "Few-Shot": few_shot_results[i]
    })

comparison_df = pd.DataFrame(comparison_data)
display(comparison_df)

# Visualize the results
plt.figure(figsize=(10, 6))
plt.bar(["Zero-Shot", "Few-Shot"], [zero_shot_accuracy, few_shot_accuracy], color=["blue", "green"])
plt.ylim(0, 1.1)
plt.ylabel("Accuracy")
plt.title("Comparison of Zero-Shot vs. Few-Shot Prompting")
for i, v in enumerate([zero_shot_accuracy, few_shot_accuracy]):
    plt.text(i, v + 0.05, f"{v:.2f}", ha="center")
plt.show()
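
A single accuracy number hides which labels a prompt gets wrong. A per-label breakdown can show, for example, whether mistakes cluster on the neutral class. The sketch below uses invented predictions, not real model output; `accuracy_by_label` is a hypothetical helper.

```python
from collections import defaultdict


def accuracy_by_label(predictions, ground_truth):
    """Return {label: fraction of items with that true label predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, truth in zip(predictions, ground_truth):
        total[truth] += 1
        if pred.strip().lower() == truth.lower():
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Made-up predictions for illustration only
truth = ["positive", "negative", "neutral", "negative", "neutral", "positive"]
preds = ["positive", "negative", "positive", "negative", "neutral", "positive"]
print(accuracy_by_label(preds, truth))
# {'positive': 1.0, 'negative': 1.0, 'neutral': 0.5}
```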

## 5. Conclusion

In this notebook, we've explored and implemented zero-shot and few-shot prompting techniques. Here are the key takeaways:

1. **Zero-shot prompting** works well for simple, well-defined tasks where the model has sufficient pre-training knowledge.

2. **Few-shot prompting** often improves performance by providing examples of the desired input-output pattern, especially for complex or ambiguous tasks, although a six-item test set like the one above is too small to confirm the size of the gain.

3. **Prompt structure matters**: Clear instructions, well-formatted examples, and explicit output formats can significantly improve results.

In the next part, we'll explore more advanced prompting techniques including chain-of-thought reasoning and self-consistency methods.