# Lecture 4: Advanced Prompt Engineering

## 🎯 Master the Art of Communicating with AI Models

Welcome to **Advanced Prompt Engineering**! In this lecture, you'll learn how to unlock the full potential of language models through effective prompting techniques. Just like learning a new language, mastering prompt engineering is about understanding how to communicate clearly and effectively with AI systems.

---

## 📚 Learning Objectives

By the end of this lecture, you will be able to:

- **Control model behavior** using inference parameters (temperature)
- **Improve reasoning accuracy** with Chain-of-Thought (CoT) prompting
- **Customize model responses** using system prompts and personas
- **Apply best practices** for effective prompt engineering in real-world scenarios

---

## 🛠️ Technical Setup

**Model:** Llama 3.2 3B (~2GB) via Ollama  
**Why this model?** A capable small model that demonstrates how proper prompting techniques can dramatically improve results, especially for reasoning tasks like math calculations. Perfect for learning and experimentation!

---

## 💡 Why Prompt Engineering Matters

Prompt engineering often makes the difference between mediocre and genuinely useful results: the same model can produce vastly different outputs depending on how you ask the question. In this lecture's demos, you'll see firsthand how small changes to a prompt can improve accuracy, relevance, and usefulness.


---

## 1. 🚀 Setup & Initialization

Let's get everything set up to start working with our language model. We'll use **Ollama**, a powerful tool that allows us to run large language models locally.


In [None]:
# Step 1: Install Ollama
# Ollama allows us to run LLMs locally without needing cloud APIs
!curl -fsSL https://ollama.com/install.sh | sh


In [None]:
# Step 2: Install Python wrapper for Ollama
# This gives us a Python interface to interact with Ollama
!pip install ollama


In [None]:
# Step 3: Start Ollama server in the background
# subprocess.Popen launches `ollama serve` as a background process,
# so the notebook is not blocked while the server runs
import subprocess
import time

# Discard server logs: an unread subprocess.PIPE can fill up and stall the server
ollama_process = subprocess.Popen(
    ['ollama', 'serve'],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL
)

# Give the server a moment to start, then confirm the process is still alive
print("⏳ Starting Ollama server...")
time.sleep(5)
if ollama_process.poll() is None:
    print("✅ Ollama server is running in the background")
    print("   Ready to load models!")
else:
    print(f"❌ 'ollama serve' exited (code {ollama_process.returncode}); a server may already be running")
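
The fixed `time.sleep(5)` in Step 3 is a guess: the server may need more or less time on your machine. As a minimal sketch, the helper below polls until the server answers instead. It assumes Ollama is listening on its default local address `http://localhost:11434`; the names `ollama_is_up` and `wait_until_ready` are illustrative and not part of Ollama.

```python
import time
import urllib.request

# Assumption: Ollama's default local address; adjust if you changed OLLAMA_HOST
OLLAMA_URL = "http://localhost:11434"

def ollama_is_up(url=OLLAMA_URL):
    """Return True if an HTTP 200 comes back from the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=1) as response:
            return response.status == 200
    except OSError:  # covers URLError, refused connections, and timeouts
        return False

def wait_until_ready(probe=ollama_is_up, timeout_s=30.0, interval_s=0.5):
    """Call probe() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False

# Usage in the notebook, right after starting `ollama serve`:
#     if not wait_until_ready():
#         raise RuntimeError("Ollama server did not start in time")
```

Passing the probe as a parameter keeps the waiting logic testable without a running server.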


In [None]:
# Step 4: Download the Llama 3.2 3B model
# This will download ~2GB - may take a few minutes depending on your connection
print("📥 Downloading Llama 3.2 3B model...")
print("   This is a one-time download. The model will be cached locally.")
!ollama pull llama3.2:3b
print("✅ Model downloaded and ready to use!")


---

## 2. 🔧 Helper Function

Let's create a reusable function to interact with our model. This will make our code cleaner and easier to experiment with different prompts and parameters.


In [None]:
import ollama

def query_model(prompt, temperature=0.7, system_prompt=""):
    """
    Query Llama 3.2 3B model via Ollama with customizable parameters.
    
    This function provides a clean interface to interact with our language model,
    allowing us to easily experiment with different prompts and settings.
    
    Args:
        prompt (str): The user prompt/question you want to ask the model
        temperature (float): Controls randomness and creativity
            - 0.0-0.3: More deterministic, focused, consistent (good for factual tasks)
            - 0.4-0.7: Balanced, natural variation (good for general tasks)
            - 0.8+: More creative, diverse, unpredictable (good for creative writing)
        system_prompt (str): Optional system prompt to set model behavior/persona
            - Use this to control the model's role, tone, or expertise level
    
    Returns:
        str: The model's response text
    """
    # Prepare the message structure
    messages = []
    
    # Add system prompt if provided (sets the model's behavior/role)
    if system_prompt:
        messages.append({
            "role": "system",
            "content": system_prompt
        })
    
    # Add the user's prompt/question
    messages.append({
        "role": "user",
        "content": prompt
    })
    
    # Query the model via Ollama
    response = ollama.chat(
        model='llama3.2:3b',
        messages=messages,
        options={
            'temperature': temperature
        }
    )
    
    return response['message']['content']

print("✅ Helper function 'query_model' created successfully!")
print("\n📖 Usage examples:")
print("   - Basic: query_model('What is AI?')")
print("   - With temperature: query_model('Write a story', temperature=1.0)")
print("   - With system prompt: query_model('Explain quantum physics', system_prompt='You are a teacher')")
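
To experiment with the request logic without a running server, one option is to pass the chat function in as a parameter. The sketch below mirrors `query_model`, but `query_with` and `fake_chat` are hypothetical names introduced here for illustration; in the notebook you would pass `ollama.chat` as `chat_fn`.

```python
def query_with(chat_fn, prompt, temperature=0.7, system_prompt="", model="llama3.2:3b"):
    """Same flow as query_model, but the chat backend is injected by the caller."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    response = chat_fn(model=model, messages=messages, options={"temperature": temperature})
    return response["message"]["content"]

def fake_chat(model, messages, options):
    """Hypothetical stand-in for ollama.chat: echoes the request instead of calling a model."""
    roles = ",".join(m["role"] for m in messages)
    text = f"[{model} T={options['temperature']} roles={roles}]"
    return {"message": {"role": "assistant", "content": text}}

# Shows which roles and settings would be sent to the model
print(query_with(fake_chat, "What is AI?", temperature=0.2, system_prompt="You are a teacher"))
```

With the real backend, the call would look like `query_with(ollama.chat, "What is AI?")`.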


---

## 3. 🌡️ Demo 1: Controlling Creativity with Temperature

### What is Temperature?

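**Temperature** is one of the most important inference parameters. It controls the randomness and creativity of model outputs by rescaling the model's next-token probability distribution before sampling: low values concentrate probability on the most likely tokens, while high values spread it across more alternatives.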
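
To make the effect concrete, here is a minimal sketch of the textbook formulation: divide the model's raw scores (logits) by the temperature, then apply softmax. The logits below are made-up values for three candidate tokens, and real runtimes typically combine temperature with other sampling filters such as top-k and top-p, so treat this as an illustration rather than a description of Ollama's internals.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 for this formula")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.1, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

At low temperature nearly all probability lands on the top-scoring token; at higher temperature the alternatives become much more likely to be sampled.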

### Temperature Ranges

| Temperature | Characteristics | Best For |
|------------|----------------|----------|
| **0.0 - 0.3** | Deterministic, focused, consistent | Factual Q&A, code generation, math problems |
| **0.4 - 0.7** | Balanced, natural, slightly varied | General conversations, explanations |
| **0.8 - 1.2** | Creative, diverse, unpredictable | Creative writing, brainstorming, storytelling |
| **1.3+** | Highly random, experimental | Experimental use cases only |

### 🎯 Demo Task

Let's see how temperature affects the same prompt. We'll ask the model to write a poem with different temperature settings and observe the differences in creativity and variation.


In [None]:
# Demo 1: Temperature Effects - Low Temperature (Deterministic)
prompt = "Write a 4-line poem about a lonely robot on Mars."

print("=" * 80)
print("🌡️ TEMPERATURE: 0.1 (Low - Deterministic & Focused)")
print("=" * 80)
print(f"📝 Prompt: {prompt}\n")
print("🤖 Model Response:")
print("-" * 80)
output_low = query_model(prompt, temperature=0.1)
print(output_low)
print("-" * 80)
print("\n💭 Observation: Notice how the model is more focused and consistent.")
print("   Low temperature produces more predictable, repeatable outputs.\n")


In [None]:
print("=" * 80)
print("🌡️ TEMPERATURE: 1.0 (High - Creative & Varied)")
print("=" * 80)
print(f"📝 Prompt: {prompt}\n")
print("🤖 Model Response:")
print("-" * 80)
output_high = query_model(prompt, temperature=1.0)
print(output_high)
print("-" * 80)
print("\n💭 Observation: Notice how the model is more creative and varied.")
print("   High temperature produces more diverse, unexpected outputs.\n")

print("=" * 80)
print("📊 KEY TAKEAWAY")
print("=" * 80)
print("💡 The SAME model with the SAME prompt produces DIFFERENT outputs!")
print("   Temperature is a powerful tool for controlling model behavior.")
print("   - Use LOW temperature for factual, consistent tasks")
print("   - Use HIGH temperature for creative, varied tasks")
print("=" * 80)
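
A single response per temperature cannot show consistency or variety on its own. One way to check, sketched below, is to send the same prompt several times and count how many distinct responses come back. `sample_outputs` and `toy_generate` are hypothetical helpers; `toy_generate` only imitates a model so the cell runs offline, and in the notebook you would pass `query_model` instead.

```python
import random
from collections import Counter

def sample_outputs(generate, prompt, temperature, n=5):
    """Call generate(prompt, temperature=...) n times and count identical responses."""
    return Counter(generate(prompt, temperature=temperature) for _ in range(n))

def toy_generate(prompt, temperature=0.7):
    """Hypothetical stand-in for query_model: always the same ending at low temperature."""
    endings = ["dust", "stars", "silence", "rust"]
    if temperature < 0.3:
        return prompt + " -> " + endings[0]
    return prompt + " -> " + random.choice(endings)

for t in (0.1, 1.0):
    counts = sample_outputs(toy_generate, "robot poem", t, n=20)
    print(f"T={t}: {len(counts)} distinct outputs out of 20")
```

With the real model, `sample_outputs(query_model, prompt, 0.1)` and `sample_outputs(query_model, prompt, 1.0)` give a rough measure of how much the temperature setting changes variety.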


---

## 4. 🧠 Demo 2: Chain-of-Thought (CoT) Prompting

### What is Chain-of-Thought?

**Chain-of-Thought (CoT)** prompting is a technique that encourages the model to "think step by step" before providing a final answer. Instead of jumping directly to a conclusion, the model breaks down the problem into intermediate reasoning steps.

### Why CoT Works

1. **Mimics Human Reasoning**: Like a person working on paper, the model writes out intermediate results and builds on them instead of producing the answer in a single jump
2. **Reduces Errors**: Breaking down complex problems helps catch mistakes early
3. **Improves Accuracy**: Especially powerful for math, logic, and multi-step reasoning tasks
4. **Makes Reasoning Transparent**: You can see how the model arrived at its answer

### 🎯 Demo Task

Let's solve a multiplication problem (`123 × 76`) in two ways:
1. **Baseline**: Direct question without step-by-step guidance
2. **With CoT**: Explicitly asking for step-by-step reasoning

**Expected Answer:** 9,348

Watch how CoT prompting improves accuracy!
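
Before asking the model, it helps to see the decomposition we want it to follow. The sketch below splits the first factor into place values (100, 20, 3), multiplies each by 76, and adds the partial products. `partial_products` is an illustrative helper written for this lecture note, not a library function.

```python
def partial_products(a, b):
    """Split a into decimal place values and multiply each part by b."""
    parts = []
    place = 1
    while a > 0:
        a, digit = divmod(a, 10)
        if digit:
            parts.append((digit * place, digit * place * b))
        place *= 10
    return list(reversed(parts))  # largest place value first

steps = partial_products(123, 76)
for part, product in steps:
    print(f"{part} * 76 = {product}")

total = sum(product for _, product in steps)
print("sum =", total)
```

The partial products are 7600, 1520, and 228, which add up to the expected answer of 9,348.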


In [None]:
# Demo 2: Chain-of-Thought - Baseline (No CoT)
prompt_baseline = "What is 123 * 76? Answer with just the number."

print("=" * 80)
print("❌ BASELINE: Direct Question (No Chain-of-Thought)")
print("=" * 80)
print(f"📝 Prompt: {prompt_baseline}\n")
print("🤖 Model Response:")
print("-" * 80)
result_baseline = query_model(prompt_baseline, temperature=0.1)
print(result_baseline)
print("-" * 80)
print("\n✅ Expected Answer: 9,348")
print(f"📊 Model Answer: {result_baseline.strip()}")
print("\n💭 Observation: Without step-by-step guidance, the model may struggle")
print("   with complex calculations or make errors.\n")


In [None]:
# Demo 2: Chain-of-Thought - With Step-by-Step Reasoning
prompt_cot = """What is 123 * 76? Show me step by step. 
Multiply 100*76, then 20*76, then 3*76, and add them up."""

print("=" * 80)
print("✅ CHAIN-OF-THOUGHT: Step-by-Step Reasoning")
print("=" * 80)
print(f"📝 Prompt: {prompt_cot}\n")
print("🤖 Model Response:")
print("-" * 80)
result_cot = query_model(prompt_cot, temperature=0.1)
print(result_cot)
print("-" * 80)
print("\n✅ Expected Answer: 9,348")
print("\n💡 KEY TAKEAWAY:")
print("   Notice how the step-by-step approach helps the model:")
print("   1. Break down the problem into manageable parts")
print("   2. Show its reasoning process (transparency!)")
print("   3. Achieve higher accuracy on complex tasks")
print("\n🎯 Best Practice: Use CoT for any reasoning task!")
print("   - Math problems")
print("   - Logic puzzles")
print("   - Multi-step problem solving")
print("   - Any task requiring careful reasoning")
print("=" * 80)
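
Reading long step-by-step responses by eye is error-prone, so it can help to check the final answer programmatically. The sketch below uses a simple heuristic: take the last integer in the response, allowing commas as thousands separators. `extract_last_number` and `is_correct` are hypothetical helpers, and the heuristic will misfire if the model keeps writing numbers after its final answer.

```python
import re

def extract_last_number(text):
    """Return the last integer found in text (commas allowed), or None if there is none."""
    matches = re.findall(r"\d[\d,]*", text)
    if not matches:
        return None
    return int(matches[-1].replace(",", ""))

def is_correct(response_text, expected):
    """Compare the last integer in the response with the expected value."""
    return extract_last_number(response_text) == expected

sample = "100*76 = 7600, 20*76 = 1520, 3*76 = 228. Adding them up gives 9,348"
print(extract_last_number(sample))
print(is_correct(sample, 9348))
```

In the notebook, `is_correct(result_baseline, 9348)` and `is_correct(result_cot, 9348)` would give a quick pass/fail comparison of the two prompts.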


---

## 5. 🎭 Interactive Demo: The "Persona" Challenge

### Understanding System Prompts

**System prompts** are special instructions that set the model's behavior, tone, and expertise level *before* the conversation begins. Think of them as setting the "role" or "personality" of the AI.

### Why System Prompts Matter

System prompts are incredibly powerful for:
- **🎯 Audience Adaptation**: Explain the same concept to a 5-year-old vs. a PhD student
- **🎭 Role-Playing**: Transform the model into a specific character or expert
- **📝 Style Control**: Set the tone (formal, casual, technical, friendly)
- **🔧 Behavior Shaping**: Define how the model should respond in different contexts
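
In a chat request, the system prompt is simply the first message, with the role `system`, placed before the user's question. The sketch below keeps a small set of personas in a dictionary and builds the message list for a chosen one. The persona texts and the names `PERSONAS` and `persona_messages` are illustrative assumptions, not a fixed API.

```python
# Hypothetical persona library: short system prompts keyed by name
PERSONAS = {
    "teacher_for_kids": "You are a teacher for 5-year-old children. "
                        "Use simple words and everyday examples.",
    "physicist": "You are an expert physicist. Use precise technical terminology.",
}

def persona_messages(persona, question):
    """Build a chat message list: the system prompt first, then the user question."""
    return [
        {"role": "system", "content": PERSONAS[persona]},
        {"role": "user", "content": question},
    ]

for name in PERSONAS:
    messages = persona_messages(name, "Explain Quantum Entanglement")
    print(name, "->", [m["role"] for m in messages])
```

The user question is identical in both cases; only the system message changes.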

### 🎯 Your Challenge

We'll ask the model to explain **Quantum Entanglement** using two completely different personas:
1. **A teacher of 5-year-olds** - Simple, friendly, using everyday language
2. **A Nobel Prize Physicist** - Technical, precise, using advanced terminology

**Watch how the same model adapts its communication style!**


In [None]:
# Interactive Demo: Persona Challenge - Part A
# Task: Explain Quantum Entanglement as a teacher of 5-year-olds

prompt = "Explain Quantum Entanglement"

# Challenge A: persona of a teacher for 5-year-olds
# Edit the system prompt below to change how the model plays a friendly teacher for young children
system_prompt_a = "You are a specialized teacher for 5-year-old children. You explain complex topics using simple words, fun analogies, and everyday examples. You are patient, enthusiastic, and make learning fun!"

print("=" * 80)
print("🎭 CHALLENGE A: Teacher for 5-Year-Olds Persona")
print("=" * 80)
print(f"📝 System Prompt: {system_prompt_a}")
print(f"❓ User Question: {prompt}\n")
print("🤖 Model Response:")
print("-" * 80)
result_a = query_model(prompt, temperature=0.7, system_prompt=system_prompt_a)
print(result_a)
print("-" * 80)
print("\n💭 Notice: Simple language, analogies, and child-friendly explanations\n")


In [None]:
# Interactive Demo: Persona Challenge - Part B
# Task: Explain Quantum Entanglement as a Nobel Prize Physicist

prompt = "Explain Quantum Entanglement"

# Challenge B: Nobel Prize Physicist persona
# Edit the system prompt below to change how the model plays an expert physicist
system_prompt_b = "You are a Nobel Prize-winning physicist with deep expertise in quantum mechanics. You use precise technical terminology, mathematical concepts, and advanced scientific language. You communicate with the precision and depth expected of a world-class researcher."

print("=" * 80)
print("🎭 CHALLENGE B: Nobel Prize Physicist Persona")
print("=" * 80)
print(f"📝 System Prompt: {system_prompt_b}")
print(f"❓ User Question: {prompt}\n")
print("🤖 Model Response:")
print("-" * 80)
result_b = query_model(prompt, temperature=0.7, system_prompt=system_prompt_b)
print(result_b)
print("-" * 80)
print("\n💭 Notice: Technical jargon, precise terminology, and advanced concepts\n")


### 📊 Comparison & Analysis

Now that you've seen both responses, let's analyze the differences:

**Key Differences to Notice:**

| Aspect | Teacher for 5-Year-Olds | Nobel Prize Physicist |
|--------|-------------------|----------------------|
| **Vocabulary** | Simple, everyday words | Technical, scientific terms |
| **Sentence Structure** | Short, clear sentences | Complex, detailed explanations |
| **Examples** | Relatable analogies (toys, games) | Mathematical and theoretical |
| **Tone** | Friendly, enthusiastic | Formal, authoritative |
| **Depth** | Surface-level concepts | Deep technical details |
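
Some of the differences in the table can be roughly quantified. As a sketch, the helper below computes sentence count, average words per sentence, and average word length for a response. `text_stats` is a hypothetical helper with deliberately naive tokenization, and the two sample strings are invented stand-ins for model output.

```python
import re

def text_stats(text):
    """Rough readability numbers: sentence count, words per sentence, word length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "sentences": len(sentences),
        "avg_words_per_sentence": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
    }

# Invented sample responses, standing in for result_a and result_b
simple = "Two toys are best friends. When one spins, the other spins too!"
technical = "Entangled particles share a joint quantum state that cannot be factored into independent states."

print(text_stats(simple))
print(text_stats(technical))
```

In the notebook, comparing `text_stats(result_a)` with `text_stats(result_b)` gives a quick numeric check on the vocabulary and sentence-structure rows.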

### 🎯 Reflection Questions

1. **Adaptation**: How well did the model adapt to each persona?
2. **Consistency**: Did it maintain the persona throughout the response?
3. **Effectiveness**: Which explanation would be better for different audiences?
4. **Real-World Use**: Where could system prompts be useful in your projects?

### 💡 Key Insight

**The same model, the same question, but completely different responses!**

This demonstrates the power of system prompts. By simply changing the system prompt, you can:
- Create educational tools for different age groups
- Build specialized AI assistants (legal, medical, technical)
- Control brand voice and communication style
- Adapt content for different audiences automatically


---

## 🎓 Key Takeaways & Best Practices

### ✅ What We Learned Today

1. **🌡️ Temperature Control**
   - Low temperature (0.0-0.3) → Predictable, consistent outputs
   - High temperature (0.8-1.2) → Creative, varied outputs
   - **Best Practice**: Start with 0.7 for general use, adjust based on task

2. **🧠 Chain-of-Thought (CoT) Prompting**
   - Break down complex problems into steps
   - Often improves accuracy on reasoning tasks
   - **Best Practice**: Default to CoT for math, logic, and multi-step problems, and still verify the final answer

3. **🎭 System Prompts**
   - Control model behavior, tone, and expertise
   - Adapt content for different audiences
   - **Best Practice**: Define the role clearly and be specific about desired behavior

### 🚀 Prompt Engineering Best Practices

| Technique | When to Use | Example |
|-----------|-------------|---------|
| **Low Temperature** | Factual Q&A, code, math | `temperature=0.1` |
| **High Temperature** | Creative writing, brainstorming | `temperature=1.0` |
| **Chain-of-Thought** | Reasoning, calculations, logic | "Show me step by step..." |
| **System Prompts** | Role-playing, audience adaptation | "You are a [role]..." |

### 💼 Real-World Applications

- **Customer Support**: System prompt sets helpful, professional tone
- **Educational Tools**: Adapt explanations for different grade levels
- **Content Creation**: Control writing style and voice
- **Code Generation**: Use low temperature + CoT for more reliable code
- **Data Analysis**: CoT helps models reason through complex problems

### 🎯 Next Steps for Practice

1. **Experiment**: Try different temperature values on the same prompt
2. **Practice CoT**: Apply step-by-step reasoning to other problems
3. **Create Personas**: Design system prompts for specific use cases
4. **Combine Techniques**: Use CoT + system prompts together
5. **Iterate**: Prompt engineering is iterative - refine and improve!
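
As a starting point for combining techniques, the sketch below builds one request that uses a system prompt, a step-by-step instruction, and a low temperature together. `COT_SUFFIX` and `build_cot_request` are hypothetical names; the returned dictionary uses the same `model`, `messages`, and `options` arguments that `query_model` passes to `ollama.chat`.

```python
# Hypothetical instruction appended to every question to request step-by-step reasoning
COT_SUFFIX = "Think step by step, then give the final answer on its own line."

def build_cot_request(question, system_prompt="", temperature=0.1, model="llama3.2:3b"):
    """Combine a persona (system prompt), a CoT instruction, and a low temperature."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": f"{question}\n\n{COT_SUFFIX}"})
    return {"model": model, "messages": messages, "options": {"temperature": temperature}}

request = build_cot_request(
    "What is 123 * 76?",
    system_prompt="You are a careful math tutor.",
)
for message in request["messages"]:
    print(f"{message['role']}: {message['content']}")
```

In the notebook, the request could then be sent with `ollama.chat(**request)`.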

---

## 📚 Additional Resources

- **Temperature Guide**: Experiment with values between 0.0 and 2.0
- **CoT Variations**: Try "think step by step", "show your work", "reason through this"
- **System Prompt Templates**: Create a library of effective personas
- **Multi-turn Conversations**: Build on previous responses for complex tasks
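
For multi-turn conversations, the key idea is to resend the whole message history on every turn, including the model's earlier replies with the role `assistant`. The sketch below shows one way to do that. `Conversation` and `fake_chat` are hypothetical names, and `fake_chat` only reports how many messages it received so the cell runs without a server.

```python
class Conversation:
    """Keeps the running message history so each turn can build on earlier ones."""

    def __init__(self, chat_fn, system_prompt="", model="llama3.2:3b"):
        self.chat_fn = chat_fn
        self.model = model
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, prompt, temperature=0.7):
        self.messages.append({"role": "user", "content": prompt})
        response = self.chat_fn(
            model=self.model,
            messages=list(self.messages),  # send a copy of the history so far
            options={"temperature": temperature},
        )
        reply = response["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply

def fake_chat(model, messages, options):
    """Hypothetical stand-in for ollama.chat: reports how much history it received."""
    return {"message": {"role": "assistant", "content": f"I can see {len(messages)} message(s)."}}

chat = Conversation(fake_chat, system_prompt="You are a concise tutor.")
print(chat.ask("What is 123 * 76?"))
print(chat.ask("Now divide that by 3."))
```

With the real backend, `Conversation(ollama.chat, system_prompt=...)` would let a follow-up question refer to the previous answer.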

---

## 🎉 Congratulations!

You've mastered the fundamentals of prompt engineering! These techniques will help you get the best results from any language model. Remember: **the quality of your prompts directly determines the quality of your outputs.**

**Happy Prompting! 🚀**
