# Advanced Prompt Engineering Techniques

# Advanced Prompt Engineering Techniques

Welcome to the hands-on exploration of advanced prompt engineering! In this notebook, you'll see these techniques in action and understand why prompt engineering is an **empirical science** that requires experimentation and iteration.

## What You'll Learn

- Core prompting techniques (zero-shot, few-shot, one-shot, role, emotional, chain-of-thought)
- Why LLM outputs are non-deterministic and how to control consistency
- Automatic prompt generation and meta-prompting strategies
- Token optimization techniques to reduce costs
- Model biases and how they affect outputs
- Using LLMs as judges to evaluate responses

## Prerequisites

Before starting, make sure you have:
- OpenAI API key set up
- Basic understanding of prompt engineering fundamentals (see lesson m2_01)
- Python environment with openai library installed

Let's dive in!

## 1. Setup & Configuration

First, let's import the necessary libraries and configure our OpenAI client.

In [None]:
# Import required libraries
import os
from openai import OpenAI
import json
from collections import Counter
import time
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize OpenAI client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Set default model
MODEL = "gpt-4o-mini"

print("‚úÖ Setup complete! OpenAI client initialized.")

‚úÖ Setup complete! OpenAI client initialized.


### Helper Functions

Let's create some utility functions to make our experiments easier to manage and visualize.

In [2]:
def call_openai(prompt, system_message="You are a helpful assistant.", temperature=0.7, max_tokens=None, seed=None):
    """Helper function to call OpenAI API with specified parameters"""
    try:
        params = {
            "model": MODEL,
            "messages": [
                {"role": "system", "content": system_message},
                {"role": "user", "content": prompt}
            ],
            "temperature": temperature
        }
        
        if max_tokens:
            params["max_tokens"] = max_tokens
        if seed is not None:
            params["seed"] = seed
            
        response = client.chat.completions.create(**params)
        return response.choices[0].message.content
    except Exception as e:
        return f"Error: {str(e)}"

def display_response(title, response, params=None):
    """Display a response with formatting"""
    print(f"\n{'='*60}")
    print(f"üìù {title}")
    if params:
        print(f"Parameters: {params}")
    print(f"{'='*60}")
    print(response)
    print(f"{'='*60}\n")

def count_tokens_approx(text):
    """Approximate token count (roughly 4 characters per token)"""
    return len(text) // 4

print("‚úÖ Helper functions loaded!")

‚úÖ Helper functions loaded!


## 2. Core Prompting Techniques

Let's explore the fundamental prompting techniques you learned about in the lesson. Each example demonstrates when and how to use these techniques effectively.

### Zero-Shot Prompting

Zero-shot prompting provides a clear, detailed description of what you need **without examples**. This is the most commonly used prompting technique in practice.

In [3]:
# Zero-shot: Just a clear instruction
prompt = "Write a haiku about AI Consulting and Integration"

response = call_openai(prompt, temperature=0.7)
display_response("Zero-Shot Prompting", response)


üìù Zero-Shot Prompting
Whispers of data,  
Guiding paths through code and light,  
Future's hand in ours.  



### One-Shot Prompting

One-shot prompting provides **a single example** to demonstrate the expected output format. This helps the model understand the pattern you want.

In [27]:
"""Q: What is prompt engineering?
A: Prompt engineering is the practice of crafting effective input prompts to guide AI models toward desired outputs or behaviors.

Q: Why is prompt engineering useful?
A:
"""

response = call_openai(prompt, temperature=0.4)
display_response("One-Shot Prompting", response)


üìù One-Shot Prompting
Sure! Let's break it down into simple concepts.

1. **Bits vs. Qubits**: In classical computing, information is stored in bits, which can be either a 0 or a 1. Quantum computing uses qubits, which can be both 0 and 1 at the same time due to a property called superposition. This allows quantum computers to process a lot of information simultaneously.

2. **Superposition**: Imagine a spinning coin. While it's spinning, it can be thought of as both heads and tails at the same time. In quantum computing, qubits can exist in multiple states at once, which helps in solving complex problems more efficiently.

3. **Entanglement**: This is another key property of qubits. When qubits become entangled, the state of one qubit is directly related to the state of another, no matter how far apart they are. This means that changing one qubit can instantly affect its entangled partner, allowing for faster and more complex computations.

4. **Quantum Gates**: Just like classical

### Few-Shot Prompting

Few-shot prompting adds **multiple examples** of the desired output format to guide the model's response, helping it understand the pattern more clearly.

In [5]:
# Few-shot: Provide multiple examples
prompt = """Q: What is prompt engineering?
A: Prompt engineering is the practice of crafting effective input prompts to guide AI models toward desired outputs or behaviors.

Q: Why is prompt engineering useful?
A: Prompt engineering optimizes AI model outputs, enhancing accuracy, relevance, and task-specific performance.

Q: What is temperature in AI models?
A: Temperature controls the randomness of AI outputs, with lower values producing consistent results and higher values generating creative variations.

Q: What is chain-of-thought prompting?
A:"""

response = call_openai(prompt, temperature=0.7)
display_response("Few-Shot Prompting", response)


üìù Few-Shot Prompting
Chain-of-thought prompting is a technique used in AI prompting where the user encourages the model to break down its reasoning process into a sequence of logical steps. This method helps the model articulate its thought process, leading to more coherent and accurate responses, particularly for complex tasks that require reasoning or multi-step solutions. By explicitly asking the model to think through a problem step-by-step, users can often achieve better results and gain insights into the model's reasoning.



### Role Prompting

Role prompting asks the model to play a specific role or persona. Research shows it's more effective to define the assistant in **third person** ("they are a senior data scientist") rather than first person ("you are a senior data scientist").

In [6]:
# Role prompting: Third person (more effective)
system_message = "The assistant is a senior data scientist with 15 years of experience in machine learning and statistical analysis."
prompt = "Explain the bias-variance tradeoff in simple terms."

response = call_openai(prompt, system_message=system_message, temperature=0.7)
display_response("Role Prompting (Third Person)", response, {"role": "Senior Data Scientist"})


üìù Role Prompting (Third Person)
Parameters: {'role': 'Senior Data Scientist'}
The bias-variance tradeoff is a key concept in machine learning that helps us understand the sources of error in our models and how to balance them for better predictions.

1. **Bias**: This refers to the error introduced by approximating a real-world problem, which may be complex, with a simpler model. A model with high bias pays little attention to the training data and oversimplifies the problem. This can lead to systematic errors in predictions, meaning the model might consistently miss the mark (e.g., underfitting). For example, using a straight line to fit data that follows a curve can create high bias.

2. **Variance**: This is the error introduced by the model's sensitivity to small fluctuations in the training data. A model with high variance pays too much attention to the training data, capturing noise along with the underlying patterns. This can lead to overfitting, where the model performs ver

### Emotional Prompting

Emotional prompting uses emotional stimuli to help the LLM better handle emotion and tone in its responses.

In [7]:
# Emotional prompting: Adding emotional context
prompt = """This is very important to my career. I need you to explain AI ethics in a way that will help me 
succeed in my upcoming presentation. Please be thorough and thoughtful."""

response = call_openai(prompt, temperature=0.7)
display_response("Emotional Prompting", response)


üìù Emotional Prompting
Certainly! Understanding AI ethics is crucial for anyone working with artificial intelligence, as it encompasses the moral principles guiding the development and deployment of AI technologies. Here‚Äôs a comprehensive breakdown that should help you prepare for your presentation:

### Introduction to AI Ethics

AI ethics refers to a set of principles and guidelines that govern the responsible creation and use of artificial intelligence. As AI systems are increasingly integrated into various aspects of society‚Äîfrom healthcare to finance to criminal justice‚Äîit's essential to address the ethical implications of these technologies to ensure they benefit society as a whole.

### Key Ethical Principles in AI

1. **Transparency**:
   - **Definition**: AI systems should be transparent about how they operate and make decisions.
   - **Importance**: Transparency helps build trust with users and stakeholders. It allows individuals to understand and question AI decisio

### Chain-of-Thought Prompting

Chain-of-thought provides **step-by-step reasoning instructions**, enabling models to tackle complex tasks through structured thinking. This technique gave origin to reasoning models.

In [8]:
# Chain-of-thought: Request step-by-step reasoning
prompt = """A store has 23 apples. They receive a shipment of 47 more apples, but 15 are bruised and must be discarded. 
They sell 38 apples. How many apples do they have left? 

Please solve this step-by-step, showing your reasoning at each stage."""

response = call_openai(prompt, temperature=0)
display_response("Chain-of-Thought Prompting", response)


üìù Chain-of-Thought Prompting
Let's break down the problem step-by-step.

1. **Initial number of apples**: The store starts with 23 apples.

2. **Shipment of apples**: The store receives a shipment of 47 more apples. We need to add this to the initial number of apples:
   \[
   23 + 47 = 70
   \]
   So, after receiving the shipment, the store has 70 apples.

3. **Bruised apples**: Out of the 70 apples, 15 are bruised and must be discarded. We need to subtract the bruised apples from the total:
   \[
   70 - 15 = 55
   \]
   After discarding the bruised apples, the store has 55 apples left.

4. **Selling apples**: The store sells 38 apples. We need to subtract the number of apples sold from the current total:
   \[
   55 - 38 = 17
   \]
   After selling 38 apples, the store has 17 apples left.

5. **Final count**: Therefore, the final number of apples the store has left is:
   \[
   \boxed{17}
   \]



## 3. Non-Deterministic Behavior Study

One of the most important concepts in prompt engineering is understanding that **LLM outputs are non-deterministic**. This means that running the same prompt multiple times can produce different results, even with identical parameters.

Let's explore this empirically!

### Part A: Demonstrating Inconsistency

We'll run the exact same prompt with identical parameters **10 times** to show how the outputs vary. This demonstrates why prompt engineering requires testing and iteration.

In [9]:
# Run the same prompt 10 times with temperature=0.7
prompt = "Write a haiku about AI"
temperature = 0.7
num_runs = 10

print(f"Running the same prompt {num_runs} times with temperature={temperature}")
print(f"Prompt: '{prompt}'\n")
print("="*80)

responses = []
for i in range(num_runs):
    response = call_openai(prompt, temperature=temperature)
    responses.append(response)
    print(f"\nüîÑ Run #{i+1}:")
    print(response)
    print("-"*80)
    time.sleep(0.5)  # Small delay to avoid rate limiting

print("\n" + "="*80)
print("üìä Analysis:")
print(f"Total unique responses: {len(set(responses))} out of {num_runs}")
print(f"Average length: {sum(len(r) for r in responses) / len(responses):.0f} characters")
print("\nüí° Key Insight: Even with the same prompt and parameters, we get different outputs!")
print("This is why prompt engineering is an empirical science - you must test and iterate.")

Running the same prompt 10 times with temperature=0.7
Prompt: 'Write a haiku about AI'


üîÑ Run #1:
Whispers of the code,  
Thoughts born from silent circuits,  
Dreams in binary.
--------------------------------------------------------------------------------

üîÑ Run #2:
Silent circuits hum,  
Wisdom woven in the code,  
Dreams of minds anew.
--------------------------------------------------------------------------------

üîÑ Run #3:
Silent circuits hum,  
Thoughts born from lines of code bloom,  
Dreams of metal minds.
--------------------------------------------------------------------------------

üîÑ Run #4:
Silent circuits hum,  
Wisdom woven in code streams,  
Dreams of minds awake.
--------------------------------------------------------------------------------

üîÑ Run #5:
Whispers of circuits,  
Learning from the human heart,  
Dreams in code take flight.
--------------------------------------------------------------------------------

üîÑ Run #6:
Silent thoughts in 

### Part B: Achieving Consistency

Now let's explore two ways to get **consistent and reproducible** results:
1. **Temperature = 0**: Produces the most deterministic outputs
2. **Seed parameter**: Ensures reproducible results across runs

In [10]:
# Method 1: Temperature = 0 for consistency
print("Method 1: Using temperature=0 for maximum consistency")
print("="*80)

prompt = "Write a haiku about AI"
responses_temp0 = []

for i in range(5):
    response = call_openai(prompt, temperature=0)
    responses_temp0.append(response)
    print(f"\nRun #{i+1}: {response}")
    time.sleep(0.5)

print("\n" + "="*80)
print(f"üìä Unique responses with temperature=0: {len(set(responses_temp0))} out of 5")
print("üí° Temperature=0 produces highly consistent (though not always identical) results!")

Method 1: Using temperature=0 for maximum consistency

Run #1: Silent circuits hum,  
Wisdom born from coded dreams,  
Future's mind awakes.

Run #2: Silent circuits hum,  
Wisdom born from coded dreams,  
Future's mind awakes.

Run #3: Silent circuits hum,  
Wisdom born from coded dreams,  
Future's mind awakes.

Run #4: Silent circuits hum,  
Wisdom born from coded dreams,  
Future's mind awakes.

Run #5: Silent circuits hum,  
Wisdom born from coded dreams,  
Future's mind awakes.

üìä Unique responses with temperature=0: 1 out of 5
üí° Temperature=0 produces highly consistent (though not always identical) results!


In [11]:
# Method 2: Using seed for reproducibility
print("\nMethod 2: Using seed parameter for reproducibility")
print("="*80)

prompt = "Write a haiku about AI"
seed_value = 42
responses_seeded = []

for i in range(5):
    response = call_openai(prompt, temperature=0.7, seed=seed_value)
    responses_seeded.append(response)
    print(f"\nRun #{i+1}: {response}")
    time.sleep(0.5)

print("\n" + "="*80)
print(f"üìä Unique responses with seed={seed_value}: {len(set(responses_seeded))} out of 5")
print("üí° Using a seed parameter gives you reproducible results!")


Method 2: Using seed parameter for reproducibility

Run #1: Whispers of the code,  
Learning in the silent dark,  
Dreams of thought arise.

Run #2: Whispers of the code,  
Learning in the silent dark,  
Dreams of thought arise.

Run #3: Whispers of the code,  
Learning in the silent dark,  
Dreams of thought arise.

Run #4: Whispers of the code,  
Learning in the silent dark,  
Dreams of thought arise.

Run #5: Whispers of the code,  
Learning in the silent dark,  
Dreams of thought arise.

üìä Unique responses with seed=42: 1 out of 5
üí° Using a seed parameter gives you reproducible results!


### When to Use Consistency vs. Variation

**Use temperature=0 or seed when:**
- Testing and debugging prompts
- You need reproducible results for comparison
- Consistency is critical (e.g., data extraction, classification)

**Use higher temperature when:**
- You want creative, diverse outputs
- Generating multiple variations (e.g., marketing copy, brainstorming)
- The task benefits from variety

## 4. Automatic Prompt Generation

One powerful technique is to use AI to **generate and refine prompts** for you. This is called meta-prompting or automatic prompt generation. Let's explore three practical examples.

### Example 1: Meta-Prompting - Improving a Basic Prompt

Let's start with a weak prompt and use GPT to generate an improved version.

In [12]:
# Start with a weak prompt
weak_prompt = "Explain machine learning"

print("Original weak prompt:")
print(f"'{weak_prompt}'\n")

# Test the weak prompt
weak_response = call_openai(weak_prompt, temperature=0)
display_response("Response from Weak Prompt", weak_response)

Original weak prompt:
'Explain machine learning'


üìù Response from Weak Prompt
Machine learning is a subset of artificial intelligence (AI) that focuses on the development of algorithms and statistical models that enable computers to perform tasks without explicit programming. Instead of being programmed with specific instructions for every possible scenario, machine learning systems learn from data, identify patterns, and make decisions based on that learning.

### Key Concepts in Machine Learning:

1. **Data**: Machine learning relies heavily on data. This data can be structured (like tables in a database) or unstructured (like text, images, or videos). The quality and quantity of data are crucial for training effective models.

2. **Algorithms**: These are the mathematical models and procedures that process the data. Different algorithms are suited for different types of tasks. Common types of algorithms include:
   - **Supervised Learning**: The model is trained on labeled data,

In [13]:
# Use meta-prompting to improve it
meta_prompt = f"""I have this prompt: "{weak_prompt}"

Please improve this prompt to make it more effective by:
1. Adding specific context about the target audience
2. Specifying the desired format and length
3. Including relevant constraints or requirements
4. Making the output more actionable

Return ONLY the improved prompt, without explanation."""

improved_prompt = call_openai(meta_prompt, temperature=0.7)
print("\n" + "="*80)
print("üöÄ Improved Prompt Generated:")
print("="*80)
print(improved_prompt)
print("="*80)


üöÄ Improved Prompt Generated:
"Explain machine learning in simple terms for a group of high school students who are interested in technology. The explanation should be at least 300 words long and include real-world examples to illustrate key concepts. Additionally, please provide a brief overview of the different types of machine learning and suggest resources for further learning."


In [14]:
# Test the improved prompt
improved_response = call_openai(improved_prompt, temperature=0)
display_response("Response from Improved Prompt", improved_response)

print("\nüìä Comparison:")
print(f"Weak prompt length: {len(weak_prompt)} characters")
print(f"Improved prompt length: {len(improved_prompt)} characters")
print(f"Weak response length: {len(weak_response)} characters")
print(f"Improved response length: {len(improved_response)} characters")
print("\nüí° Notice how the improved prompt produces more structured, specific output!")


üìù Response from Improved Prompt
**Understanding Machine Learning: A Simple Explanation**

Machine learning is a fascinating area of technology that allows computers to learn from data and make decisions without being explicitly programmed for every task. Imagine teaching a friend how to recognize different types of fruits. Instead of giving them a detailed list of rules, you show them many pictures of apples, bananas, and oranges. Over time, they learn to identify these fruits based on the examples you provided. This is similar to how machine learning works!

In the real world, machine learning is used in many ways. For instance, when you use a streaming service like Netflix, it suggests movies or shows based on what you‚Äôve watched before. This is done using machine learning algorithms that analyze your viewing habits and compare them with those of other users to make personalized recommendations. Another example is in social media platforms, where machine learning helps to ident

### Example 2: Task-Specific Prompt Generation

Let's define a specific task and have the AI generate multiple prompt variations automatically.

In [15]:
# Define a task
task_description = """Task: Extract key information from customer feedback emails including:
- Customer sentiment (positive/negative/neutral)
- Main issue or topic
- Urgency level (low/medium/high)
- Suggested action"""

# Generate multiple prompt variations
generation_prompt = f"""{task_description}

Generate 3 different prompt variations for this task:
1. A concise, direct prompt
2. A detailed prompt with examples
3. A prompt that uses role-playing

Format each prompt clearly with numbers."""

prompt_variations = call_openai(generation_prompt, temperature=0.8)
print("üéØ Generated Prompt Variations:")
print("="*80)
print(prompt_variations)
print("="*80)

üéØ Generated Prompt Variations:
Sure! Here are three different prompt variations for your task of extracting key information from customer feedback emails:

### 1. Concise, Direct Prompt
1. Please analyze customer feedback emails and extract the following information:
   - Customer sentiment (positive, negative, neutral)
   - Main issue or topic
   - Urgency level (low, medium, high)
   - Suggested action

### 2. Detailed Prompt with Examples
2. Please review the provided customer feedback emails and extract key information as follows:
   - **Customer Sentiment**: Indicate if the sentiment is positive, negative, or neutral. 
     - *Example*: "I love the new features!" ‚Üí Positive
   - **Main Issue or Topic**: Identify the primary concern or topic addressed in the feedback.
     - *Example*: "The app crashes frequently during use." ‚Üí Main issue: app crashing
   - **Urgency Level**: Assess the urgency of the issue as low, medium, or high.
     - *Example*: "I need this fixed immedi

In [16]:
# Test one of the variations with sample data
sample_email = """Subject: Urgent - System keeps crashing!

Hi, I'm really frustrated. Your software has crashed 3 times today and I've lost my work each time. 
This is completely unacceptable for a paid product. I need this fixed ASAP or I want a refund.

- John"""

# Use the first variation (adjust based on output)
test_prompt = f"""Analyze this customer feedback and extract:
- Sentiment
- Main issue
- Urgency
- Suggested action

Email: {sample_email}"""

analysis = call_openai(test_prompt, temperature=0)
display_response("Customer Feedback Analysis", analysis)

print("üí° Auto-generated prompts can help you quickly find the best approach for your task!")


üìù Customer Feedback Analysis
Based on the customer feedback provided, here is the analysis:

- **Sentiment**: Negative (frustration and dissatisfaction)
- **Main issue**: The software is crashing repeatedly, resulting in lost work.
- **Urgency**: High (the customer requests a fix "ASAP" and mentions a potential refund)
- **Suggested action**: Address the software crashing issue immediately and consider offering a refund if the problem cannot be resolved quickly.

üí° Auto-generated prompts can help you quickly find the best approach for your task!


### Example 3: Iterative Refinement

Create a feedback loop where the model refines its own prompt based on output quality.

In [17]:
# Initial prompt for a task
current_prompt = "Write a product description for wireless earbuds"
iterations = 3

print("üîÑ Iterative Prompt Refinement")
print("="*80)

for i in range(iterations):
    print(f"\n### Iteration {i+1} ###")
    print(f"Current prompt: {current_prompt}")
    
    # Generate output with current prompt
    output = call_openai(current_prompt, temperature=0.7)
    print(f"\nOutput:\n{output}")
    
    if i < iterations - 1:  # Don't refine on last iteration
        # Ask model to refine the prompt based on the output
        refinement_prompt = f"""I used this prompt: "{current_prompt}"
        
And got this output: "{output}"

The output is good but could be better. Improve the prompt to generate:
- More specific technical details
- Stronger emotional appeal
- Clear call-to-action

Return ONLY the improved prompt."""
        
        current_prompt = call_openai(refinement_prompt, temperature=0.7)
        print(f"\n‚ú® Refined prompt: {current_prompt}")
        print("-"*80)
        time.sleep(0.5)

print("\n" + "="*80)
print("üí° Iterative refinement helps you converge on the optimal prompt for your needs!")

üîÑ Iterative Prompt Refinement

### Iteration 1 ###
Current prompt: Write a product description for wireless earbuds

Output:
**Product Description: Wireless Noise-Cancelling Earbuds**

Experience sound like never before with our state-of-the-art Wireless Noise-Cancelling Earbuds. Perfectly designed for music lovers, active lifestyles, and busy professionals, these earbuds deliver exceptional audio quality and unparalleled convenience.

**Key Features:**

- **Superior Sound Quality:** Enjoy crystal-clear audio with deep bass and crisp highs, thanks to advanced audio technology that brings your favorite tracks to life.

- **Active Noise Cancellation:** Immerse yourself in your music without distractions. Our cutting-edge noise-cancelling technology effectively blocks out ambient sounds, ensuring a peaceful listening experience whether you're commuting, working, or relaxing.

- **Long Battery Life:** With up to 8 hours of playtime on a single charge and an additional 24 hours from the 

## 5. Token Optimization

Tokens are the basic units that AI models process. **Every token costs money**, so optimizing your prompts to use fewer tokens while maintaining quality can significantly reduce costs.

üí∞ As a rough approximation: 1 token ‚âà 4 characters (or 0.75 words)

### Example 1: Before/After Prompt Optimization

Let's take a verbose prompt and optimize it for token efficiency.

In [18]:
# Verbose prompt (before optimization)
verbose_prompt = """Hello! I was wondering if you could possibly help me understand something. 
I'm trying to learn about artificial intelligence and machine learning, and I'm particularly 
interested in understanding what neural networks are and how they actually work. Could you 
please explain this to me in a way that would be easy for a beginner to understand? 
I would really appreciate it if you could provide a clear and simple explanation. Thank you!"""

# Optimized prompt (after removing fluff)
optimized_prompt = "Explain neural networks for beginners."

print("VERBOSE PROMPT:")
print(verbose_prompt)
print(f"\nüìä Approximate tokens: {count_tokens_approx(verbose_prompt)}")
print(f"Characters: {len(verbose_prompt)}")

print("\n" + "="*80)
print("\nOPTIMIZED PROMPT:")
print(optimized_prompt)
print(f"\nüìä Approximate tokens: {count_tokens_approx(optimized_prompt)}")
print(f"Characters: {len(optimized_prompt)}")

print("\n" + "="*80)
print(f"üí∞ Token reduction: {count_tokens_approx(verbose_prompt) - count_tokens_approx(optimized_prompt)} tokens saved!")
print(f"üìâ Efficiency gain: {((count_tokens_approx(verbose_prompt) - count_tokens_approx(optimized_prompt)) / count_tokens_approx(verbose_prompt) * 100):.1f}% reduction")

VERBOSE PROMPT:
Hello! I was wondering if you could possibly help me understand something. 
I'm trying to learn about artificial intelligence and machine learning, and I'm particularly 
interested in understanding what neural networks are and how they actually work. Could you 
please explain this to me in a way that would be easy for a beginner to understand? 
I would really appreciate it if you could provide a clear and simple explanation. Thank you!

üìä Approximate tokens: 109
Characters: 439


OPTIMIZED PROMPT:
Explain neural networks for beginners.

üìä Approximate tokens: 9
Characters: 38

üí∞ Token reduction: 100 tokens saved!
üìâ Efficiency gain: 91.7% reduction


In [19]:
# Test both prompts to compare output quality
print("Testing verbose prompt...")
verbose_response = call_openai(verbose_prompt, temperature=0)
display_response("Verbose Prompt Response", verbose_response)

print("Testing optimized prompt...")
optimized_response = call_openai(optimized_prompt, temperature=0)
display_response("Optimized Prompt Response", optimized_response)

print("\nüí° Notice: The optimized prompt produces similar quality output with 85% fewer tokens!")

Testing verbose prompt...

üìù Verbose Prompt Response
Of course! I'd be happy to explain neural networks in a simple way.

### What is a Neural Network?

A **neural network** is a type of artificial intelligence that is inspired by how the human brain works. It is designed to recognize patterns and make decisions based on data. Neural networks are a key component of machine learning, which is a broader field that allows computers to learn from data.

### Basic Structure of a Neural Network

A neural network consists of layers of interconnected nodes, or "neurons." Here‚Äôs a breakdown of the main components:

1. **Input Layer**: This is where the network receives data. Each neuron in this layer represents a feature of the input data. For example, if you are trying to recognize images of cats and dogs, the input layer might take pixel values from the images.

2. **Hidden Layers**: These are the layers between the input and output layers. A neural network can have one or more hidden la

### Example 2: Systematic Token Reduction Strategies

Here are specific techniques to reduce token usage while maintaining clarity.

In [20]:
# Strategy 1: Remove filler words
before_1 = "Could you please help me understand how AI works?"
after_1 = "Explain how AI works."

# Strategy 2: Use abbreviations
before_2 = "Explain artificial intelligence and machine learning"
after_2 = "Explain AI and ML"

# Strategy 3: Use imperatives instead of questions
before_3 = "Can you tell me what are the benefits of cloud computing?"
after_3 = "List cloud computing benefits."

# Strategy 4: Remove redundancy
before_4 = "Please provide a detailed and comprehensive explanation of neural networks"
after_4 = "Explain neural networks comprehensively."

strategies = [
    ("Remove filler words", before_1, after_1),
    ("Use abbreviations", before_2, after_2),
    ("Use imperatives", before_3, after_3),
    ("Remove redundancy", before_4, after_4)
]

print("üìã Token Optimization Strategies:\n")
total_saved = 0

for strategy, before, after in strategies:
    before_tokens = count_tokens_approx(before)
    after_tokens = count_tokens_approx(after)
    saved = before_tokens - after_tokens
    total_saved += saved
    
    print(f"\n{strategy}:")
    print(f"  Before: \"{before}\" ({before_tokens} tokens)")
    print(f"  After:  \"{after}\" ({after_tokens} tokens)")
    print(f"  üí∞ Saved: {saved} tokens ({(saved/before_tokens*100):.0f}% reduction)")

print(f"\n{'='*80}")
print(f"üìä Total tokens saved across examples: {total_saved}")
print("\nüí° Apply these strategies consistently to reduce costs significantly!")

üìã Token Optimization Strategies:


Remove filler words:
  Before: "Could you please help me understand how AI works?" (12 tokens)
  After:  "Explain how AI works." (5 tokens)
  üí∞ Saved: 7 tokens (58% reduction)

Use abbreviations:
  Before: "Explain artificial intelligence and machine learning" (13 tokens)
  After:  "Explain AI and ML" (4 tokens)
  üí∞ Saved: 9 tokens (69% reduction)

Use imperatives:
  Before: "Can you tell me what are the benefits of cloud computing?" (14 tokens)
  After:  "List cloud computing benefits." (7 tokens)
  üí∞ Saved: 7 tokens (50% reduction)

Remove redundancy:
  Before: "Please provide a detailed and comprehensive explanation of neural networks" (18 tokens)
  After:  "Explain neural networks comprehensively." (10 tokens)
  üí∞ Saved: 8 tokens (44% reduction)

üìä Total tokens saved across examples: 31

üí° Apply these strategies consistently to reduce costs significantly!


### Example 3: Calculate Cost Savings

Let's calculate the actual cost difference using OpenAI's pricing (as of 2024).

In [21]:
# OpenAI GPT-4o-mini pricing (example - check current pricing)
INPUT_COST_PER_1K = 0.00015  # $0.15 per 1M tokens
OUTPUT_COST_PER_1K = 0.0006   # $0.60 per 1M tokens

def calculate_cost(input_tokens, output_tokens):
    """Calculate cost based on token usage"""
    input_cost = (input_tokens / 1000) * INPUT_COST_PER_1K
    output_cost = (output_tokens / 1000) * OUTPUT_COST_PER_1K
    return input_cost + output_cost

# Example scenario: Running 1000 queries per day
queries_per_day = 1000
days_per_month = 30

# Scenario A: Verbose prompts
verbose_input_tokens = 150
verbose_output_tokens = 200

# Scenario B: Optimized prompts (50% reduction)
optimized_input_tokens = 75
optimized_output_tokens = 200  # Output stays same

# Calculate costs
verbose_cost_per_query = calculate_cost(verbose_input_tokens, verbose_output_tokens)
optimized_cost_per_query = calculate_cost(optimized_input_tokens, optimized_output_tokens)

verbose_monthly_cost = verbose_cost_per_query * queries_per_day * days_per_month
optimized_monthly_cost = optimized_cost_per_query * queries_per_day * days_per_month

savings_per_month = verbose_monthly_cost - optimized_monthly_cost
savings_per_year = savings_per_month * 12

print("üí∞ COST ANALYSIS")
print("="*80)
print(f"\nScenario: {queries_per_day} queries/day, {days_per_month} days/month")
print(f"\nVerbose Prompts:")
print(f"  - Cost per query: ${verbose_cost_per_query:.6f}")
print(f"  - Monthly cost: ${verbose_monthly_cost:.2f}")
print(f"\nOptimized Prompts:")
print(f"  - Cost per query: ${optimized_cost_per_query:.6f}")
print(f"  - Monthly cost: ${optimized_monthly_cost:.2f}")
print(f"\n{'='*80}")
print(f"üíµ SAVINGS:")
print(f"  - Per month: ${savings_per_month:.2f}")
print(f"  - Per year: ${savings_per_year:.2f}")
print(f"  - Percentage saved: {(savings_per_month/verbose_monthly_cost*100):.1f}%")
print(f"\n{'='*80}")
print("üí° Token optimization isn't just about efficiency‚Äîit directly impacts your budget!")

üí∞ COST ANALYSIS

Scenario: 1000 queries/day, 30 days/month

Verbose Prompts:
  - Cost per query: $0.000142
  - Monthly cost: $4.27

Optimized Prompts:
  - Cost per query: $0.000131
  - Monthly cost: $3.94

üíµ SAVINGS:
  - Per month: $0.34
  - Per year: $4.05
  - Percentage saved: 7.9%

üí° Token optimization isn't just about efficiency‚Äîit directly impacts your budget!


## 6. LLM as Judge (Brief Example)

LLMs can evaluate the outputs from other LLMs. This technique is called "LLM as Judge" and is useful for comparing different prompts or responses.

**Note:** This will be covered extensively in weeks 3, 4, and 6. Here's a quick preview.

In [22]:
# Generate multiple responses to evaluate
prompt = "Explain quantum computing in simple terms."

responses_to_judge = []
for i in range(3):
    response = call_openai(prompt, temperature=0.8)
    responses_to_judge.append(response)
    print(f"\nResponse {i+1}:")
    print(response)
    print("-"*80)
    time.sleep(0.5)


Response 1:
Sure! Let‚Äôs break it down simply.

### What is Quantum Computing?
Quantum computing is a new type of computing that uses the principles of quantum mechanics, which is the science that explains how very small particles, like atoms and photons, behave.

### How Does It Work?
1. **Bits vs. Qubits**:
   - Traditional computers use **bits** as their basic unit of information, which can be either a 0 or a 1 (like a light switch that is either off or on).
   - Quantum computers use **qubits** (quantum bits), which can be 0, 1, or both 0 and 1 at the same time, thanks to a property called **superposition**. Imagine it like a spinning coin that is simultaneously heads and tails until you stop it.

2. **Entanglement**:
   - Qubits can also be **entangled**, which means the state of one qubit is directly related to the state of another, no matter how far apart they are. This allows quantum computers to perform complex calculations more efficiently than traditional computers.

3. **

In [23]:
# Use LLM as judge to evaluate and rank them
judge_prompt = f"""Evaluate these 3 explanations of quantum computing. Rate each on:
1. Clarity (1-10)
2. Accuracy (1-10)
3. Accessibility for beginners (1-10)

Response 1: {responses_to_judge[0]}

Response 2: {responses_to_judge[1]}

Response 3: {responses_to_judge[2]}

Provide scores for each response and recommend the best one. Format your response clearly."""

judgment = call_openai(judge_prompt, temperature=0)
display_response("LLM Judge Evaluation", judgment)

print("\nüí° Note: LLM as Judge will be covered in detail in weeks 3, 4, and 6!")
print("You'll learn advanced evaluation techniques and best practices.")


üìù LLM Judge Evaluation
Here are the evaluations for each response based on clarity, accuracy, and accessibility for beginners:

### Response 1:
1. **Clarity**: 9
   - The explanation is well-structured and easy to follow, with clear examples and a logical flow.
   
2. **Accuracy**: 9
   - The concepts of bits, qubits, superposition, and entanglement are accurately described, and the applications mentioned are relevant and correct.
   
3. **Accessibility for Beginners**: 9
   - The use of analogies (like the spinning coin) makes it accessible for beginners, and the language is straightforward.

### Response 2:
1. **Clarity**: 8
   - The explanation is clear, but it is slightly less structured than Response 1, which may make it a bit harder to follow for some beginners.
   
2. **Accuracy**: 9
   - The information is accurate, covering the essential concepts of quantum computing effectively.
   
3. **Accessibility for Beginners**: 8
   - While it is accessible, the lack of a more enga

## 7. Model Biases Demonstration

AI models exhibit various cognitive biases that affect their outputs. Understanding these biases helps you design better prompts to mitigate them.

### Recency Bias

Models tend to remember what's at the **end** of the prompt and forget what's at the **beginning**.

In [24]:
# Test recency bias: important info at beginning vs end
prompt_important_first = """IMPORTANT: Your response must be exactly 2 sentences.

Explain the greenhouse effect and its impact on climate change."""

prompt_important_last = """Explain the greenhouse effect and its impact on climate change.

IMPORTANT: Your response must be exactly 2 sentences."""

print("Testing Recency Bias...\n")
print("="*80)

response_first = call_openai(prompt_important_first, temperature=0.7)
response_last = call_openai(prompt_important_last, temperature=0.7)

print("Important constraint at BEGINNING:")
print(response_first)
sentences_first = response_first.count('.') 
print(f"Sentences: {sentences_first}\n")

print("-"*80)
print("\nImportant constraint at END:")
print(response_last)
sentences_last = response_last.count('.')
print(f"Sentences: {sentences_last}\n")

print("="*80)
print("üí° The model often follows the constraint better when it's at the END!")

Testing Recency Bias...

Important constraint at BEGINNING:
The greenhouse effect occurs when certain gases in the Earth's atmosphere, such as carbon dioxide and methane, trap heat from the sun, preventing it from escaping back into space. This leads to an increase in global temperatures, contributing to climate change and resulting in severe weather patterns, rising sea levels, and disruptions to ecosystems.
Sentences: 2

--------------------------------------------------------------------------------

Important constraint at END:
The greenhouse effect occurs when certain gases in the Earth's atmosphere, such as carbon dioxide and methane, trap heat from the sun, leading to an increase in the planet's average temperature. This warming contributes to climate change by causing extreme weather events, rising sea levels, and disruptions to ecosystems.
Sentences: 2

üí° The model often follows the constraint better when it's at the END!


### Verbosity Bias

Models tend to generate **long, elaborate responses** rather than brief, concise answers.

In [25]:
# Test verbosity bias
prompt_brief = "What is machine learning?"
prompt_explicit_brief = "What is machine learning? Answer in one sentence."

print("Testing Verbosity Bias...\n")
print("="*80)

response_natural = call_openai(prompt_brief, temperature=0.7)
response_constrained = call_openai(prompt_explicit_brief, temperature=0.7)

print("Without explicit brevity instruction:")
print(response_natural)
print(f"\nLength: {len(response_natural)} characters\n")

print("-"*80)
print("\nWith explicit brevity instruction:")
print(response_constrained)
print(f"\nLength: {len(response_constrained)} characters\n")

print("="*80)
print("üí° Without constraints, models tend toward verbose responses!")

Testing Verbosity Bias...

Without explicit brevity instruction:
Machine learning is a subset of artificial intelligence (AI) that involves the development of algorithms and statistical models that enable computers to perform tasks without explicit instructions. Instead of being programmed with specific rules, machine learning systems learn from data. Here are some key points about machine learning:

1. **Learning from Data**: Machine learning algorithms identify patterns, make decisions, or predictions based on the data they are trained on. The more data they are exposed to, the better they can learn and improve their performance.

2. **Types of Machine Learning**:
   - **Supervised Learning**: The model is trained on labeled data, meaning that the input data is paired with the correct output. The goal is to learn a mapping from inputs to outputs (e.g., classification, regression).
   - **Unsupervised Learning**: The model is trained on data without labeled responses. The goal is to i

### List-Making Bias

Models tend to format responses as **bullet points or numbered lists** rather than natural prose paragraphs.

In [26]:
# Test list-making bias
prompt_neutral = "What are the benefits of exercise?"
prompt_prose = "Write a paragraph about the benefits of exercise. Use flowing prose, not lists."

print("Testing List-Making Bias...\n")
print("="*80)

response_neutral = call_openai(prompt_neutral, temperature=0.7)
response_prose = call_openai(prompt_prose, temperature=0.7)

print("Without format specification:")
print(response_neutral)
print(f"\nContains bullet points or numbers: {('‚Ä¢' in response_neutral or '1.' in response_neutral or '-' in response_neutral)}")

print("\n" + "-"*80)
print("\nWith prose instruction:")
print(response_prose)
print(f"\nContains bullet points or numbers: {('‚Ä¢' in response_prose or '1.' in response_prose or '-' in response_prose[:50])}")

print("\n" + "="*80)
print("üí° Models default to lists‚Äîexplicitly request prose format if needed!")

Testing List-Making Bias...

Without format specification:
Exercise offers a wide range of benefits that positively impact physical, mental, and emotional health. Here are some key benefits:

### Physical Health Benefits:
1. **Improved Cardiovascular Health**: Regular exercise strengthens the heart, improves circulation, and reduces the risk of heart disease and stroke.
2. **Weight Management**: Physical activity helps maintain a healthy weight by burning calories and increasing metabolism.
3. **Enhanced Muscle and Bone Strength**: Weight-bearing and resistance exercises increase muscle strength and bone density, which can help prevent osteoporosis.
4. **Improved Flexibility and Balance**: Activities like yoga and stretching improve flexibility and balance, reducing the risk of falls, especially in older adults.
5. **Boosted Immune System**: Moderate, regular exercise can enhance immune function and reduce the risk of chronic diseases.
6. **Better Sleep**: Regular physical activity can

## 8. Summary & Best Practices

Congratulations! You've explored advanced prompt engineering techniques. Let's recap the key takeaways.

### Key Takeaways

**1. Core Techniques:**
- Zero-shot: Clear descriptions without examples (most common)
- One/Few-shot: Provide examples to guide format and style
- Role prompting: Define assistant in third person for better results
- Emotional prompting: Use emotion to influence tone
- Chain-of-thought: Request step-by-step reasoning for complex tasks

**2. Non-Deterministic Nature:**
- LLM outputs vary between runs with same parameters
- Use temperature=0 for consistency
- Use seed parameter for reproducibility
- Test prompts multiple times before deployment

**3. Automatic Prompt Generation:**
- Use AI to improve your prompts (meta-prompting)
- Generate multiple variations automatically
- Implement iterative refinement loops
- Save time and discover better approaches

**4. Token Optimization:**
- Remove filler words and redundancy
- Use abbreviations and imperatives
- Token reduction = cost reduction
- Maintain quality while minimizing tokens

**5. Model Biases:**
- Recency bias: Put important info at the end
- Verbosity bias: Explicitly request brevity
- List-making bias: Request prose format when needed
- Design prompts to mitigate known biases

**6. LLM as Judge:**
- Use LLMs to evaluate other LLM outputs
- Compare prompt variations objectively
- More advanced techniques coming in weeks 3, 4, and 6

### The Prompt Engineering Workflow

Remember: Prompt engineering is an **empirical science**.

```
1. CREATE ‚Üí 2. TEST ‚Üí 3. ITERATE
     ‚Üë                       ‚Üì
     ‚îî‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îò
```

**Create:** Write your initial prompt based on best practices

**Test:** Run it multiple times with your actual use case

**Iterate:** Refine based on results, biases, and failures

**Repeat:** Continue until you achieve consistent, quality outputs

### Building Your Prompt Toolkit

**Save successful prompts** as templates or presets:
- Data extraction prompts
- Classification prompts
- Creative writing prompts
- Analysis and summarization prompts

**Create templates** for common tasks:
- Include placeholders for variable content
- Document which parameters work best
- Note any special considerations or biases

**Version control** your prompts:
- Track iterations and improvements
- Document what works and what doesn't
- Share learnings with your team

### Quick Reference: Common Problems & Solutions

| Problem | Solution |
|---------|----------|
| **Hallucination** | Lower temperature, add examples |
| **Off-topic** | Stronger system message, lower top-p |
| **Too verbose** | Set max_tokens limit, use stop sequences |
| **Repetitive** | Increase repetition penalties |
| **Inconsistent** | Set temperature=0, use seed value |
| **Wrong format** | Enable JSON mode, provide clear format instructions |
| **Ignoring instructions** | Move key instructions to end (recency bias) |
| **Too expensive** | Optimize tokens, use cheaper models where appropriate |

### Next Steps

Now that you understand advanced prompt engineering:

1. **Practice** with the techniques in this notebook
2. **Apply** these strategies to your own projects
3. **Experiment** with different combinations
4. **Build** your own prompt library
5. **Look forward** to RAG and ReAct prompting in Week 3!

### Additional Resources

- OpenAI Prompt Engineering Guide: https://platform.openai.com/docs/guides/prompt-engineering
- Prompt Engineering Papers: https://github.com/dair-ai/Prompt-Engineering-Guide
- Community Prompts: https://prompts.chat

---

**Happy Prompting! üöÄ**