# LLM Prompt Engineering  

This hands-on tutorial teaches advanced techniques for crafting effective prompts that get the most out of AI models such as Llama-4-Maverick, served through Meta's Llama API (in research preview).


## Workshop Overview

In this workshop, we'll explore four core prompt engineering techniques and one practical application, all using real-world examples:

1. **Zero-Shot Prompting**: Getting results without examples
2. **Few-Shot Prompting**: Learning from examples
3. **Chain of Thought Reasoning**: Step-by-step problem solving
4. **Role Prompting**: Giving the AI a specific persona
5. **Data Cleaning Applications**: Practical business use cases

Each section includes detailed explanations, code examples, and hands-on exercises that you can run immediately.

---

## Setup and Basic Usage

Before diving into advanced techniques, let's establish our foundation. We'll set up our environment and create a robust helper function that will be the backbone of all our experiments.

### Understanding the Llama API Structure

The Llama API follows the OpenAI-style chat completion format, where conversations are represented as lists of messages. Each message has a role (like 'user', 'system', or 'assistant') and content. This structure allows for rich, contextual interactions that can maintain conversation history and role-specific behavior.
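
To make the message structure concrete, here is a minimal sketch of a conversation as a plain Python list. The helper name `build_messages` is hypothetical and introduced only for illustration; it is not part of the Llama API client.

```python
def build_messages(user_prompt, system_prompt=None, history=None):
    """Assemble a chat-style message list (hypothetical helper, not part of the client)."""
    messages = []
    if system_prompt:
        # The system message goes first and sets persona or constraints.
        messages.append({"role": "system", "content": system_prompt})
    # Earlier turns, if any, carry the conversation context.
    messages.extend(history or [])
    # The newest user turn always comes last.
    messages.append({"role": "user", "content": user_prompt})
    return messages


conversation = build_messages(
    "And who maintains it today?",
    system_prompt="You are a concise assistant.",
    history=[
        {"role": "user", "content": "Who created Python?"},
        {"role": "assistant", "content": "Guido van Rossum."},
    ],
)
for message in conversation:
    print(f"{message['role']:>9}: {message['content']}")
```

A list built this way can be passed directly as the `messages` argument of a chat completion call.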



In [None]:
#!pip install llama-api-client
import os
from llama_api_client import LlamaAPIClient

# Initialize the client with the key from the LLAMA_API_KEY environment variable.
# Never hard-code API keys in notebooks or source files.
client = LlamaAPIClient(
    api_key=os.environ.get("LLAMA_API_KEY")
)

# Basic function to interact with the model
def call_llama(messages, model="Llama-4-Maverick-17B-128E-Instruct-FP8"):
    """
    Helper function to call Llama API with error handling
    """
    try:
        completion = client.chat.completions.create(
            model=model,
            messages=messages,
        )
        return completion.completion_message.content.text
    except Exception as e:
        return f"Error: {str(e)}"

# Test basic functionality
response = call_llama([
    {"role": "user", "content": "Hello, can you introduce yourself?"}
])
print("Basic Response:", response)

Basic Response: Hello! I'm Llama, a Meta-designed model here to adapt to your conversational style. Whether you need quick answers, deep dives into ideas, or just want to vent, joke or brainstorm—I'm here for it. What’s on your mind?



### Breaking Down Our Helper Function

The `call_llama` function we just created is a thin wrapper around the API that gives the rest of the workshop a consistent foundation:

**Key Features:**
- **Error Handling**: Catches any exception raised by the API call (network issues, rate limits) and returns it as an `Error: ...` string instead of raising
- **Flexible Messaging**: Accepts any message format (user, system, assistant roles)
- **Model Selection**: Allows switching between different Llama models
- **Consistent Interface**: Standardizes how we interact with the API throughout the workshop

**Why This Matters:**
In production environments, API calls can fail for various reasons (network issues, rate limits, model availability). Our helper function ensures that your prompt engineering experiments won't crash unexpectedly, making it easier to iterate and refine your techniques.

The test call we make verifies that:
1. Your API key is correctly configured
2. The model is accessible
3. The basic communication pipeline works
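
One caveat: because `call_llama` returns failures as ordinary strings, an error can be mistaken for a model response further down a pipeline. A possible alternative, sketched below under assumptions, is to retry transient failures and raise once the attempts run out. The names `call_with_retries` and `flaky_call` are hypothetical, and the stand-in function simulates failures so the sketch runs without network access.

```python
import time


def call_with_retries(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff; re-raise the last error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the real error to the caller.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for an API call that fails twice before succeeding (illustrative only).
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


result = call_with_retries(flaky_call, sleep=lambda seconds: None)
print(result, "after", attempts["count"], "attempts")
```

In the workshop setting, `fn` could be a lambda wrapping `client.chat.completions.create(model=model, messages=messages)`.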

---

## Zero-Shot Prompting

Zero-shot prompting is the foundation of prompt engineering. It's called "zero-shot" because you're asking the model to perform a task without providing any examples—you're giving it "zero shots" to learn from. The model relies entirely on its pre-training to understand and complete the task.

### When to Use Zero-Shot Prompting

Zero-shot prompting works best for:
- **Common tasks**: Things the model has seen many times during training
- **Well-defined problems**: Tasks with clear, unambiguous requirements
- **Quick prototyping**: When you need fast results without setup time
- **Simple classifications**: Basic categorization or labeling tasks

### The Anatomy of a Good Zero-Shot Prompt

Effective zero-shot prompts typically include:
1. **Clear task description**: What you want the model to do
2. **Input specification**: What data you're providing
3. **Output format**: How you want the response structured
4. **Context or constraints**: Any important limitations or requirements
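
The four components above can be assembled mechanically. The following is a minimal sketch; the helper `build_zero_shot_prompt` and its parameter names are hypothetical, chosen only to mirror the list.

```python
def build_zero_shot_prompt(task, input_text, output_format, constraints=None):
    """Combine the four components into one prompt string (hypothetical helper)."""
    parts = [task.strip()]                      # 1. clear task description
    if constraints:
        parts.append(constraints.strip())       # 4. context or constraints
    parts.append(f'Text: "{input_text}"')       # 2. input specification
    parts.append(f"{output_format.strip()}:")   # 3. output format cue
    return "\n\n".join(parts)


prompt = build_zero_shot_prompt(
    task="Classify the following text as either 'positive', 'negative', or 'neutral'.",
    input_text="The delivery arrived on time.",
    output_format="Classification",
    constraints="Only output one word.",
)
print(prompt)
```

Ending the prompt with an open `Classification:` cue nudges the model to answer in place rather than restate the task.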

### Simple Zero-Shot Example: Sentiment Classification



In [None]:
def zero_shot_classification():
    """
    Demonstrate zero-shot text classification
    """
    messages = [
        {
            "role": "user", 
            "content": """Classify the following text as either 'positive', 'negative', or 'neutral':

Text: "I absolutely love this new restaurant! The food was amazing and the service was excellent."

Classification:"""
        }
    ]
    
    response = call_llama(messages)
    print("Zero-shot Classification Result:")
    print(response)

zero_shot_classification()

Zero-shot Classification Result:
The classification of the given text is: **positive**.

The text contains strong positive language, such as "absolutely love", "amazing", and "excellent", indicating a very favorable opinion of the restaurant.


In [22]:
def zero_shot_classification_refined():
    """
    Demonstrate zero-shot text classification with a one-word output constraint
    """
    messages = [
        {
            "role": "user", 
            "content": """Classify the following text as either 'positive', 'negative', or 'neutral'. Only output one word:

Text: "I absolutely love this new restaurant! The food was amazing and the service was excellent."

Classification:"""
        }
    ]
    
    response = call_llama(messages)
    print("Zero-shot Classification Result:")
    print(response)

zero_shot_classification_refined()

Zero-shot Classification Result:
Positive



**Why This Example Works**: Sentiment analysis is well represented in the model's training data, so it can identify emotional tone without any examples. The refined prompt adds "Only output one word", which removes the explanation and leaves just the label.
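
The two runs above differ only in how much text surrounds the label. When downstream code needs the bare label either way, a small normalizer can help. This is a sketch under assumptions: `parse_label` is a hypothetical helper, and it treats a response as ambiguous if it mentions more than one label.

```python
import re


def parse_label(response, allowed=("positive", "negative", "neutral")):
    """Return the single allowed label found in response, or None if unclear."""
    found = {
        label
        for label in allowed
        if re.search(rf"\b{re.escape(label)}\b", response, flags=re.IGNORECASE)
    }
    # Exactly one distinct label is unambiguous; zero or several is not.
    return found.pop() if len(found) == 1 else None


print(parse_label("Positive"))
print(parse_label("The classification of the given text is: **positive**."))
print(parse_label("It could be positive or negative."))
```

A negated phrase such as "not positive" would still match, so this is a convenience check, not a replacement for constraining the output in the prompt.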

### Zero-Shot Question Answering: Working with Context

Question answering with context is a more complex zero-shot task that demonstrates the model's reading comprehension abilities.



In [23]:
def zero_shot_qa():
    """
    Demonstrate zero-shot question answering with context
    """
    messages = [
        {
            "role": "user",
            "content": """Answer the following question based on the given context:

Context: "Python is a high-level programming language created by Guido van Rossum in 1991. It emphasizes code readability and simplicity, making it popular for beginners and experts alike. Python supports multiple programming paradigms including procedural, object-oriented, and functional programming."

Question: Who created Python and when?

Answer:"""
        }
    ]
    
    response = call_llama(messages)
    print("Zero-shot QA Result:")
    print(response)

zero_shot_qa()

Zero-shot QA Result:
Guido van Rossum in 1991.


**Key Insight**: Zero-shot QA works well when the answer is explicitly stated in the context. For more complex reasoning or implicit information, you might need few-shot examples or chain of thought prompting.

### Zero-Shot Limitations and Solutions

While powerful, zero-shot prompting has limitations:
- **Inconsistent formatting**: Without examples, output format may vary
- **Ambiguous tasks**: Complex or unusual tasks may be misinterpreted
- **Domain-specific knowledge**: Specialized fields may require more guidance

When zero-shot prompting isn't sufficient, that's when few-shot prompting becomes valuable.

---

## Few-Shot Prompting

Few-shot prompting is like teaching by example. Instead of just describing what you want, you show the model a few examples of the input-output pattern you're looking for. This technique often improves performance on tasks where zero-shot prompting produces inconsistent or suboptimal results.

### The Psychology of Few-Shot Learning

Few-shot prompting works because it:
- **Establishes patterns**: Shows the model exactly what "good" looks like
- **Reduces ambiguity**: Eliminates guesswork about output format
- **Provides context**: Helps the model understand edge cases and nuances
- **Enables consistency**: Creates a template for reliable, standardized responses

### How Many Examples Should You Use?

The "few" in few-shot typically means:
- **2-3 examples**: Often sufficient for simple tasks
- **3-5 examples**: Good for more complex patterns
- **More than 5 examples**: Rarely needed; each extra example adds tokens and may hit context limits

**Quality over quantity**: Three diverse, high-quality examples usually outperform ten similar ones.
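
Keeping examples in a list makes it easy to add, remove, or swap them while experimenting with how many are needed. The sketch below assumes a hypothetical helper, `build_few_shot_prompt`, and reuses the sentiment examples from the classification prompt in the next subsection.

```python
def build_few_shot_prompt(instruction, examples, new_text):
    """Format (text, label) pairs as numbered examples, then the new input."""
    blocks = [instruction]
    for number, (text, label) in enumerate(examples, start=1):
        blocks.append(f'Example {number}:\nText: "{text}"\nClassification: {label}')
    # End with an open "Classification:" cue so the model completes the pattern.
    blocks.append(f'Now classify this text:\nText: "{new_text}"\nClassification:')
    return "\n\n".join(blocks)


examples = [
    ("This movie was terrible, I want my money back.", "negative"),
    ("The weather is okay today, not too hot or cold.", "neutral"),
    ("I'm so excited about my vacation next week!", "positive"),
]
prompt = build_few_shot_prompt(
    "Classify the following texts as 'positive', 'negative', or 'neutral':",
    examples,
    "The customer service was disappointing and unhelpful.",
)
print(prompt)
```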

### Few-Shot Classification: Building on Zero-Shot



In [24]:
def few_shot_classification():
    """
    Demonstrate few-shot learning for sentiment classification
    """
    messages = [
        {
            "role": "user",
            "content": """Classify the following texts as 'positive', 'negative', or 'neutral':

Example 1:
Text: "This movie was terrible, I want my money back."
Classification: negative

Example 2:
Text: "The weather is okay today, not too hot or cold."
Classification: neutral

Example 3:
Text: "I'm so excited about my vacation next week!"
Classification: positive

Now classify this text:
Text: "The customer service was disappointing and unhelpful."
Classification:"""
        }
    ]
    
    response = call_llama(messages)
    print("Few-shot Classification Result:")
    print(response)
    
few_shot_classification()

Few-shot Classification Result:
The classification of the text: "The customer service was disappointing and unhelpful." is: negative.

The text expresses a clear negative sentiment towards the customer service, using words like "disappointing" and "unhelpful" to convey a unfavorable opinion.


### Few-Shot Data Extraction: Structured Output

Data extraction tasks particularly benefit from few-shot prompting because they require consistent, structured output formats.



In [25]:
def few_shot_extraction():
    """
    Demonstrate few-shot learning for structured data extraction
    """
    messages = [
        {
            "role": "user",
            "content": """Extract name, age, and occupation from the following texts:

Example 1:
Text: "Hi, I'm Sarah Johnson, a 28-year-old software engineer."
Extraction: {"name": "Sarah Johnson", "age": 28, "occupation": "software engineer"}

Example 2:
Text: "My name is Michael Chen and I'm 35. I work as a doctor."
Extraction: {"name": "Michael Chen", "age": 35, "occupation": "doctor"}

Now extract from this text:
Text: "Hello, I'm Emma Davis, I'm 32 years old and I'm a marketing manager."
Extraction:"""
        }
    ]
    
    response = call_llama(messages)
    print("Few-shot Extraction Result:")
    print(response)

few_shot_extraction()

Few-shot Extraction Result:
Based on the provided text, the extraction is:

{"name": "Emma Davis", "age": 32, "occupation": "marketing manager"}
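
The response above wraps the JSON in a sentence of prose, so passing it straight to `json.loads` would fail. One way to cope, sketched here with a hypothetical helper `extract_json_object`, is to scan for the first brace that starts a valid JSON object.

```python
import json


def extract_json_object(text):
    """Return the first JSON object embedded in text, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            value, _end = decoder.raw_decode(text, start)
            if isinstance(value, dict):
                return value
        except json.JSONDecodeError:
            pass  # Not valid JSON from this brace; try the next one.
        start = text.find("{", start + 1)
    return None


response = (
    "Based on the provided text, the extraction is:\n\n"
    '{"name": "Emma Davis", "age": 32, "occupation": "marketing manager"}'
)
record = extract_json_object(response)
print(record["name"], record["age"], record["occupation"])
```

Adding an instruction such as "Output only the JSON object" to the prompt is the complementary fix on the prompting side.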


### Best Practices for Few-Shot Example Selection

When choosing examples for few-shot prompting:

**Diversity is crucial**: 
- Different input formats and styles
- Various edge cases and scenarios
- Representative of real-world data variation

**Quality matters**: 
- Perfect examples teach perfect patterns
- Consistent formatting across all examples
- Clear, unambiguous input-output relationships

**Strategic coverage**: 
- Include edge cases you expect to encounter
- Show how to handle missing or unclear information
- Demonstrate the desired level of detail

---

## Chain of Thought Reasoning

Chain of Thought (CoT) prompting encourages models to "think out loud" by showing their reasoning process step by step. This approach is particularly useful for complex problems that require multi-step reasoning, mathematical calculations, or logical analysis.

### Why Chain of Thought Works

Traditional prompting often produces answers without explanation, like a student showing only the final answer on a math test. CoT prompting is like asking the student to "show their work"—and remarkably, this improves accuracy even when the model isn't specifically trained for reasoning.

**The Science Behind CoT:**
- **Decomposition**: Complex problems become manageable sub-problems
- **Error reduction**: Step-by-step reasoning catches logical errors
- **Transparency**: You can see where and why the model might go wrong
- **Confidence building**: Explicit reasoning builds trust in the output

### CoT with Non-Reasoning Models

While Llama 4 Maverick isn't specifically designed as a reasoning model (like OpenAI's o3 or Gemini 2.5 Pro), we can still elicit step-by-step thinking through careful prompt engineering. The key is creating a framework that encourages methodical problem-solving.

### Mathematical Problem Solving with CoT



In [26]:
def chain_of_thought_math():
    """
    Demonstrate chain of thought reasoning for math problems
    """
    messages = [
        {
            "role": "user",
            "content": """Solve this step by step, showing your reasoning:

Problem: A store has 45 apples. They sell 18 apples in the morning and 12 apples in the afternoon. How many apples are left?

Let me think through this step by step:
1) First, I need to find the total number of apples sold
2) Then subtract that from the original amount

Step-by-step solution:"""
        }
    ]
    
    response = call_llama(messages)
    print("Chain of Thought Math Result:")
    print(response)

chain_of_thought_math()

Chain of Thought Math Result:
To solve the problem, let's follow the steps you've outlined.

**Step 1: Find the total number of apples sold**

The store sells 18 apples in the morning and 12 apples in the afternoon. To find the total number of apples sold, we need to add these two numbers together.

Total apples sold = Apples sold in the morning + Apples sold in the afternoon
Total apples sold = 18 + 12
Total apples sold = 30

**Step 2: Subtract the total number of apples sold from the original amount**

The store originally had 45 apples. We now know that they sold 30 apples in total. To find out how many apples are left, we need to subtract the total number of apples sold from the original amount.

Apples left = Original number of apples - Total apples sold
Apples left = 45 - 30
Apples left = 15

Therefore, there are **15 apples left** in the store.


### Logical Reasoning with CoT

CoT is particularly valuable for logical puzzles and philosophical problems where the reasoning process is as important as the conclusion.



In [27]:
def chain_of_thought_logic():
    """
    Demonstrate chain of thought for logical reasoning
    """
    messages = [
        {
            "role": "user",
            "content": """Analyze this logical problem step by step:

Problem: All birds can fly. Penguins are birds. Can penguins fly?

Please think through this carefully:
1) First, identify the given statements
2) Then apply logical reasoning
3) Finally, provide your conclusion with explanation

Analysis:"""
        }
    ]
    
    response = call_llama(messages)
    print("Chain of Thought Logic Result:")
    print(response)

chain_of_thought_logic()

Chain of Thought Logic Result:
## Step 1: Identify the given statements
The given statements are: "All birds can fly" and "Penguins are birds." These are the premises of the argument.

## Step 2: Apply logical reasoning
To determine if penguins can fly, we need to apply the given premises to the question. The first premise states that all birds can fly, and the second premise states that penguins are birds. Using logical deduction, if penguins are classified as birds and all birds can fly, then it logically follows that penguins can fly according to the given premises.

## Step 3: Evaluate the conclusion based on the premises
However, we must also consider the real-world truth and the validity of the premises. The statement "All birds can fly" is not true in reality because there are birds, like penguins and ostriches, that cannot fly. Despite this, within the context of the given problem, we are to assume the premises are true.

## Step 4: Provide the conclusion with explanation
Based

### Complex Multi-Step CoT Reasoning

For more sophisticated problems, we can create detailed reasoning frameworks that guide the model through complex calculations and decisions.



In [28]:
def complex_chain_of_thought():
    """
    Demonstrate complex reasoning with explicit step structure
    """
    messages = [
        {
            "role": "user",
            "content": """Solve this problem by thinking through each step:

Problem: A company has 100 employees. 60% work in sales, 25% work in engineering, and the rest work in administration. If the company hires 20 new sales people and 10 new engineers, what percentage of the total workforce will be in sales?

Please solve this step by step:
Step 1: Calculate current distribution
Step 2: Calculate new hires impact
Step 3: Calculate new total and sales percentage

Solution:"""
        }
    ]
    
    response = call_llama(messages)
    print("Complex Chain of Thought Result:")
    print(response)

complex_chain_of_thought()

Complex Chain of Thought Result:
To solve this problem, we'll break it down into the required steps.

### Step 1: Calculate current distribution

First, let's determine how many employees are currently working in sales, engineering, and administration.

- Total employees = 100
- Employees in sales = 60% of 100 = 0.6 * 100 = 60
- Employees in engineering = 25% of 100 = 0.25 * 100 = 25
- Employees in administration = 100% - (60% + 25%) = 15% of 100 = 0.15 * 100 = 15

So, currently, there are 60 employees in sales, 25 in engineering, and 15 in administration.

### Step 2: Calculate new hires impact

The company is hiring 20 new salespeople and 10 new engineers.

- New employees in sales = 60 (current) + 20 (new) = 80
- New employees in engineering = 25 (current) + 10 (new) = 35
- Employees in administration remain the same = 15

### Step 3: Calculate new total and sales percentage

Now, let's calculate the new total number of employees and the percentage of employees in sales.

- New tota

### CoT Best Practices and Guidelines

**Effective CoT Framework Design:**

1. **Clear step enumeration**: Number or label each reasoning step
2. **Logical flow**: Each step should build naturally on previous steps
3. **Explicit instructions**: Tell the model what to think about at each stage
4. **Verification points**: Include steps that allow checking intermediate results

**When to Use CoT:**
- Mathematical calculations with multiple steps
- Logical reasoning and analysis
- Complex decision-making scenarios
- Problems where showing work adds value
- Debugging and error analysis tasks

**CoT Limitations:**
- Longer responses (more tokens/cost)
- May over-complicate simple problems
- Can hallucinate reasoning steps
- Requires careful framework design
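
Because CoT output can contain plausible-looking but wrong steps, it is worth checking final answers independently when the problem allows it. As a small sketch, the expected answers for the apples and workforce problems above can be recomputed directly from the problem statements (variable names are illustrative):

```python
# Apples problem: 45 in stock, 18 sold in the morning, 12 in the afternoon.
apples_left = 45 - (18 + 12)

# Workforce problem: 100 employees, 60% in sales, then 20 sales and 10 engineering hires.
sales_after = 100 * 60 // 100 + 20          # 60 + 20 = 80
total_after = 100 + 20 + 10                 # 130
sales_pct = 100 * sales_after / total_after

print(f"Apples left: {apples_left}")
print(f"Sales share after hiring: {sales_pct:.2f}%")
```

Comparing a model's final line against values like these catches arithmetic slips that a well-formatted chain of reasoning can hide.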

---

## Role Prompting with System Messages

Role prompting is a powerful technique that assigns the AI model a specific persona, expertise area, or behavioral pattern. By giving the model a "role to play," you can dramatically influence not just what it says, but how it says it, what knowledge it emphasizes, and what style it adopts.

### The Psychology of Role Prompting

Role prompting works because:
- **Context activation**: The model activates knowledge associated with that role
- **Style adaptation**: Communication style matches the assigned persona
- **Behavioral consistency**: The role provides behavioral guidelines
- **Expertise focus**: Relevant domain knowledge is prioritized

### System Messages vs User Messages

**System Messages**: 
- Set persistent behavioral guidelines
- Establish the model's persona and constraints
- Remain active throughout the conversation
- Less likely to be overridden by user instructions

**User Messages**: 
- Provide specific tasks and questions
- Can include role information but less persistent
- More easily modified or contradicted
- Better for one-off instructions
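
The difference is easiest to see in the message lists themselves. Below is a minimal sketch with two hypothetical helpers: one places the role in a system message, the other folds it into the user message.

```python
def with_role(system_prompt, user_prompt):
    """Pair a persistent system persona with a one-off user task (hypothetical helper)."""
    return [
        {"role": "system", "content": system_prompt},  # persona and constraints
        {"role": "user", "content": user_prompt},      # the specific request
    ]


def with_inline_role(role_description, user_prompt):
    """Alternative: fold the role into the user message itself."""
    return [{"role": "user", "content": f"{role_description}\n\n{user_prompt}"}]


question = "Explain what a function is in programming."
persona = "You are a patient and encouraging programming teacher."

print(with_role(persona, question))
print(with_inline_role(persona, question))
```

With the system-message form, follow-up user turns can be appended without restating the persona.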

### Expert Role Prompting: Technical Authority



In [29]:
def expert_role_prompting():
    """
    Demonstrate role prompting by assigning expert persona
    """
    messages = [
        {
            "role": "system",
            "content": "You are a senior data scientist with 10 years of experience in machine learning and statistics. You explain complex concepts clearly and provide practical insights."
        },
        {
            "role": "user",
            "content": "Explain the bias-variance tradeoff in machine learning and how it affects model performance."
        }
    ]
    
    response = call_llama(messages)
    print("Expert Role Response:")
    print(response)

expert_role_prompting()

Expert Role Response:
The bias-variance tradeoff! One of the most fundamental concepts in machine learning. As a senior data scientist, I'm happy to break it down for you.

**What is the bias-variance tradeoff?**

The bias-variance tradeoff is a property of supervised learning models that describes the inherent tension between two types of errors: bias and variance. In essence, it's a tradeoff between how well a model fits the training data (bias) and how well it generalizes to new, unseen data (variance).

**Bias**

Bias refers to the difference between the model's expected prediction and the true value. A model with high bias pays little attention to the training data and oversimplifies the relationship between the features and target variable. As a result, it tends to underfit the data, failing to capture important patterns and relationships. Think of bias as a systematic error that occurs when a model is too simple or makes too many assumptions about the data.

**Variance**

Varian


### Analyzing Expert Role Effectiveness

This expert role example demonstrates several key principles:

**Credibility Establishment**: 
- "Senior data scientist with 10 years of experience" creates authority
- Specific expertise areas (ML and statistics) focus the knowledge domain
- Professional background implies practical, not just theoretical, knowledge

**Communication Style Guidelines**: 
- "Explain complex concepts clearly" sets an accessibility expectation
- "Provide practical insights" emphasizes actionable advice
- Balances technical depth with clarity

**Expected Behavioral Changes**: 
With this role, you should expect:
- More technical terminology used appropriately
- References to practical applications and real-world scenarios
- Structured explanations with clear examples
- Confidence in technical assertions

**Topic Complexity**: 
The bias-variance tradeoff is a sophisticated ML concept that benefits from expert-level explanation. The role steers the response toward an authoritative, comprehensive answer rather than a superficial one.

### Teacher Role Prompting: Educational Focus



In [30]:
def teacher_role_prompting():
    """
    Demonstrate role prompting as an educational instructor
    """
    messages = [
        {
            "role": "system",
            "content": "You are a patient and encouraging programming teacher. You break down complex concepts into simple steps, provide examples, and always check for understanding."
        },
        {
            "role": "user",
            "content": "I'm a beginner. Can you explain what a function is in programming?"
        }
    ]
    
    response = call_llama(messages)
    print("Teacher Role Response:")
    print(response)

teacher_role_prompting()

Teacher Role Response:
I'm so glad you're starting your programming journey! I'd be happy to help you understand what a function is.

**What is a Function?**

In programming, a function is a block of code that performs a specific task. Think of it like a recipe: you put in some ingredients (inputs), follow the steps (code), and get a result (output).

**Breaking it Down**

1. **Inputs**: A function can take in values, called arguments or parameters, which are used to perform the task.
2. **Code**: The function contains a set of instructions that are executed when the function is called.
3. **Output**: A function can return a result, which can be used by the rest of the program.

**Example**

Let's consider a simple example: a function that greets someone by name.

* Input: a person's name (e.g., "John")
* Code: the function uses the input name to construct a greeting message
* Output: the greeting message (e.g., "Hello, John!")

In code, this might look like:
```python
def greet(name):
```


### Educational Role Design Principles

The teacher role demonstrates different communication priorities:

**Pedagogical Approach**: 
- "Patient and encouraging" sets emotional tone
- "Break down complex concepts into simple steps" defines methodology
- "Always check for understanding" ensures comprehension

**Beginner-Friendly Characteristics**: 
- Avoids jargon without explanation
- Uses analogies and real-world examples
- Provides step-by-step progression
- Encourages questions and interaction

**Contrast with Expert Role**: 
- **Expert**: Assumes knowledge, provides depth, uses technical language
- **Teacher**: Assumes no prior knowledge, builds understanding gradually
- **Expert**: Efficient, comprehensive coverage
- **Teacher**: Patient, thorough explanation of basics

**Question Choice**: 
"What is a function in programming?" is fundamental but can be explained at many levels. The teacher role ensures an accessible, beginner-appropriate explanation.

### Business Analyst Role: Structured Professional Output



In [11]:
def analyst_role_prompting():
    """
    Demonstrate role prompting with specific constraints and style
    """
    messages = [
        {
            "role": "system",
            "content": "You are a business analyst who always provides structured responses with clear headings, bullet points, and actionable recommendations. You focus on practical business implications."
        },
        {
            "role": "user",
            "content": "Analyze the pros and cons of implementing a 4-day work week in a tech company."
        }
    ]
    
    response = call_llama(messages)
    print("Analyst Role Response:")
    print(response)
    return response

analyst_role_prompting()

Analyst Role Response:
**4-Day Work Week Analysis: Tech Company Implementation**

### Executive Summary

Implementing a 4-day work week in a tech company can have both positive and negative impacts on the organization. This analysis highlights the key pros and cons, providing a comprehensive overview to inform business decisions.

### Pros of a 4-Day Work Week

* **Increased Employee Satisfaction and Retention**
	+ Improved work-life balance
	+ Enhanced productivity and focus during working hours
	+ Potential reduction in turnover rates
* **Cost Savings**
	+ Reduced overhead costs (e.g., utilities, facilities maintenance)
	+ Lower commuting costs for employees
* **Talent Attraction and Competitive Advantage**
	+ Unique benefit to attract top talent in a competitive job market
	+ Differentiation from traditional 5-day work week companies
* **Environmental Benefits**
	+ Reduced carbon footprint due to decreased commuting

### Cons of a 4-Day Work Week

* **Potential Impact on Customer Se



### Professional Role Engineering

The business analyst role showcases advanced role prompting techniques:

**Dual Role Specification**: 
- **Professional identity**: "Business analyst" sets domain expertise
- **Communication style**: Structured responses with specific formatting

**Format Requirements**: 
- "Clear headings" - organizes information hierarchically
- "Bullet points" - makes content scannable and digestible
- "Actionable recommendations" - ensures practical utility

**Business Focus**: 
- "Practical business implications" prioritizes real-world impact
- Emphasizes decision-making utility over theoretical analysis
- Results-oriented rather than academic

**Expected Output Structure**: 
With this role, expect responses like:
- **Executive Summary**
- **Pros and Cons** (bulleted lists)
- **Key Considerations**
- **Recommendations** (specific, actionable)
- **Implementation Notes**

### Role Prompting Best Practices

**Effective Role Design**: 

1. **Specific expertise**: Define clear knowledge domains
2. **Communication style**: Specify how the role should "speak"
3. **Behavioral guidelines**: Include personality and approach characteristics
4. **Output format**: Define structure and presentation requirements
5. **Constraints**: Specify what the role should avoid or emphasize
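
These five elements can be treated as slots in a template. The sketch below uses a hypothetical helper, `build_role_prompt`; the constraint text in the usage example is illustrative and not taken from the prompts above.

```python
def build_role_prompt(expertise, style, behavior=None, output_format=None, constraints=None):
    """Join the role-design components into one system prompt (hypothetical helper)."""
    sentences = [f"You are {expertise}.", style]
    # Optional components are added only when provided.
    for extra in (behavior, output_format, constraints):
        if extra:
            sentences.append(extra)
    # Normalize so every component ends with exactly one period.
    return " ".join(s.strip().rstrip(".") + "." for s in sentences)


system_prompt = build_role_prompt(
    expertise="a business analyst",
    style="You focus on practical business implications",
    output_format="Always use clear headings, bullet points, and actionable recommendations",
    constraints="Avoid purely theoretical discussion",  # illustrative constraint
)
print(system_prompt)
```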

**Role Selection Strategy**: 
- **Match expertise to task complexity**: Technical roles for technical tasks
- **Consider your audience**: Teacher role for beginners, expert role for specialists
- **Define output needs**: Analyst role for structured business documents
- **Think about tone**: Professional, casual, academic, creative

**Common Role Types**: 
- **Subject matter experts**: Doctor, lawyer, engineer, scientist
- **Communication specialists**: Teacher, journalist, technical writer
- **Business professionals**: Analyst, consultant, manager, strategist
- **Creative roles**: Writer, designer, brainstorming partner

---

## Data Cleaning with Prompt Engineering

Data cleaning is one of the most practical and immediately valuable applications of prompt engineering in business contexts. Real-world data is messy, inconsistent, and often requires significant preprocessing before it can be used for analysis or machine learning. This section demonstrates how different prompt engineering techniques can be applied to transform chaotic data into clean, standardized formats.

### Why Data Cleaning Matters

Commonly cited industry estimates put data cleaning and preparation at 60-80% of a data scientist's time. Common data quality issues include:
- **Inconsistent formatting**: Names in different cases, phone numbers in various formats
- **Missing values**: Empty fields, "unknown" entries, inconsistent null representations
- **Type inconsistencies**: Ages as words instead of numbers, dates in different formats
- **Standardization needs**: Job titles that mean the same thing but are written differently
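
Some of these issues are regular enough to handle with plain rules before any model is involved, which leaves the LLM for the genuinely ambiguous cases. As a baseline for comparison, here is a minimal sketch that assumes US-style 10-digit phone numbers; `normalize_phone` is a hypothetical helper, and the samples mirror the formats in the dataset created below.

```python
import re


def normalize_phone(raw):
    """Return a 10-digit US number as XXX-XXX-XXXX, or None if it doesn't fit."""
    digits = re.sub(r"\D", "", raw)           # drop everything except digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                   # strip a leading country code
    if len(digits) != 10:
        return None                           # leave unusual values for review
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


samples = ["555-123-4567", "(555) 987-6543", "555.321.9876", "+1-555-234-5678", "5551234567"]
for sample in samples:
    print(f"{sample:>16} -> {normalize_phone(sample)}")
```

Fields like "thirty-two" or "unknown" in an age column are harder to cover with rules of this kind, which is where prompt-based cleaning becomes more attractive.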

### The Power of LLMs for Data Cleaning

Large Language Models excel at data cleaning because they:
- **Understand context**: Can infer what "thirty-two" means in an age field
- **Handle ambiguity**: Can make reasonable decisions about edge cases
- **Recognize patterns**: Can standardize variations of the same information
- **Apply business logic**: Can make domain-specific cleaning decisions

### Working with Pandas DataFrames

Throughout this section, we'll use pandas DataFrames because they're the standard tool for data manipulation in Python. This approach demonstrates how prompt engineering integrates with existing data science workflows.

### Creating Our Messy Dataset

Let's start by creating a dataset that represents common real-world data quality issues.



In [12]:
import pandas as pd
import json

def create_messy_dataframe():
    """
    Create a sample messy dataset as a pandas DataFrame
    """
    messy_data = {
        'raw_record': [
            "john doe, 25, software engineer, john.doe@email.com, 555-123-4567",
            "JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543",
            "Bob Johnson, thirty-two, Sales Rep, bob.johnson@gmail.com, 555.321.9876",
            "mary williams, 28, Data Scientist, mary.williams@company.com, 555-456-7890",
            "MIKE BROWN, 45, HR Manager, mike@hr.company.com, +1-555-234-5678",
            "sarah davis, unknown, developer, sarah.davis@tech.com, 5551234567",
            "Tom Wilson, 35, Product Manager, tom.wilson@company.com, 555-876-5432",
            "lisa garcia, 29, UX Designer, lisa@design.com, (555) 345-6789"
        ]
    }
    
    df = pd.DataFrame(messy_data)
    df['record_id'] = range(1, len(df) + 1)
    
    print("Messy Dataset as DataFrame:")
    print(df.to_string(index=False))
    print(f"\nDataset shape: {df.shape}")
    
    return df

messy_df = create_messy_dataframe()

Messy Dataset as DataFrame:
                                                                raw_record  record_id
         john doe, 25, software engineer, john.doe@email.com, 555-123-4567          1
           JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543          2
   Bob Johnson, thirty-two, Sales Rep, bob.johnson@gmail.com, 555.321.9876          3
mary williams, 28, Data Scientist, mary.williams@company.com, 555-456-7890          4
          MIKE BROWN, 45, HR Manager, mike@hr.company.com, +1-555-234-5678          5
         sarah davis, unknown, developer, sarah.davis@tech.com, 5551234567          6
     Tom Wilson, 35, Product Manager, tom.wilson@company.com, 555-876-5432          7
             lisa garcia, 29, UX Designer, lisa@design.com, (555) 345-6789          8

Dataset shape: (8, 2)



### Understanding Our Data Quality Issues

Our sample dataset contains typical real-world problems:

**Naming Inconsistencies**: 
- "john doe" (all lowercase)
- "JANE SMITH" (all uppercase)
- "Bob Johnson" (proper case)
- "mary williams" (lowercase)

**Age Format Variations**: 
- "25" (numeric)
- "" (missing/empty)
- "thirty-two" (written out)
- "unknown" (explicit unknown value)

**Job Title Inconsistencies**: 
- "software engineer" vs "developer" (same role, different terms)
- "Sales Rep" vs "marketing manager" (different cases)
- "Data Scientist" (proper case)

**Phone Number Chaos**: 
- "555-123-4567" (standard format)
- "(555) 987-6543" (parentheses format)
- "555.321.9876" (dot separators)
- "+1-555-234-5678" (international format)
- "5551234567" (no separators)

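**Email Variations**: 
- Different domains, including a subdomain ("mike@hr.company.com") and a ".co" address ("jane@company.co")
- Usernames with and without a dot separator ("john.doe" vs "jane"); all are already lowercase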

This diversity mirrors what you'll encounter in real business data, making our examples immediately applicable to practical scenarios.
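
As a rough way to see these issues programmatically, the sketch below splits a raw record on commas and flags a few of the problems listed above. The function name `profile_record` is hypothetical (it is not part of the notebook), and the sketch assumes that no field contains a comma.

```python
import re

# Target phone format used throughout this section
PHONE_STANDARD = re.compile(r"^\d{3}-\d{3}-\d{4}$")

def profile_record(raw_record):
    """Split one raw record on commas and flag a few common issues.

    Hypothetical helper for illustration; assumes exactly five
    comma-separated fields with no commas inside a field.
    """
    fields = [field.strip() for field in raw_record.split(",")]
    if len(fields) != 5:
        return {"parse_ok": False}
    name, age, _job_title, _email, phone = fields
    return {
        "parse_ok": True,
        "name_needs_case_fix": name != name.title(),
        "age_is_numeric": age.isdigit(),
        "phone_is_standard": bool(PHONE_STANDARD.match(phone)),
    }

print(profile_record("JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543"))
```

A profile like this gives a quick count of how many records need attention before any LLM call is made. Job titles are left out on purpose: `str.title()` would turn "HR Manager" into "Hr Manager", which is one reason a model is a better fit for that field.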

### Helper Functions for DataFrame Integration



In [13]:
def parse_cleaned_json(response_text):
    """
    Extract JSON from LLM response and handle parsing errors
    """
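    # Guard against a non-string response (for example, None after a failed call)
    if not isinstance(response_text, str):
        return None
    
    try:
        # Slice from the first '{' to the last '}' to skip surrounding text
        start_idx = response_text.find('{')
        end_idx = response_text.rfind('}') + 1
        
        if start_idx != -1 and end_idx != 0:
            json_str = response_text[start_idx:end_idx]
            return json.loads(json_str)
        else:
            return None
    except json.JSONDecodeError:
        return None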

def apply_cleaning_to_dataframe(df, cleaning_function, column_name='raw_record'):
    """
    Apply a cleaning function to each row in the DataFrame
    """
    cleaned_results = []
    
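    # enumerate() keeps the progress counter correct even if df has a non-default index
    for position, (_, row) in enumerate(df.iterrows(), start=1):
        print(f"Processing record {position}...")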
        response = cleaning_function(row[column_name])
        parsed_json = parse_cleaned_json(response)
        
        if parsed_json:
            cleaned_results.append(parsed_json)
        else:
            # Fallback for unparseable responses
            cleaned_results.append({
                'name': None,
                'age': None,
                'job_title': None,
                'email': None,
                'phone': None,
                'error': 'Failed to parse response'
            })
    
    return pd.DataFrame(cleaned_results)

print("Helper functions defined for DataFrame operations.")

Helper functions defined for DataFrame operations.



### Understanding Our Data Processing Pipeline

These helper functions create a robust data processing framework:

**JSON Parsing with Error Handling**: 
The `parse_cleaned_json` function handles the common challenge of extracting structured data from LLM responses:
- **Flexible parsing**: Takes everything from the first `{` to the last `}`, so text before or after a single JSON object is ignored
- **Error resilience**: Returns None instead of crashing on invalid JSON
- **Real-world adaptation**: LLMs sometimes include explanatory text around the JSON

**DataFrame Integration**: 
The `apply_cleaning_to_dataframe` function bridges prompt engineering with pandas workflows:
- **Row-by-row processing**: Applies cleaning functions to each record
- **Progress tracking**: Shows processing status for long datasets
- **Fallback handling**: Creates error records when parsing fails
- **Consistent output**: Always returns a DataFrame, even with errors

**Production Considerations**: 
- **Error tracking**: Captures failures without stopping the entire process
- **Data integrity**: Leaves the input DataFrame unchanged and returns the cleaned fields in a new DataFrame
- **Scalability**: Makes one API call per row, so runtime and cost grow linearly with dataset size
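
One limitation of slicing from the first `{` to the last `}` is that a stray brace in the model's explanatory text breaks the parse. A possible alternative, sketched below, scans the response with `json.JSONDecoder.raw_decode` from the standard library and collects every span that decodes as a JSON object. The name `extract_json_objects` is hypothetical and not part of the notebook.

```python
import json

def extract_json_objects(text):
    """Return every JSON object that can be decoded from text, in order.

    Hypothetical alternative to parse_cleaned_json: tries to decode at
    each '{' and skips positions that do not start valid JSON.
    """
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode returns the parsed value and the index where it ended
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; keep scanning
            continue
        objects.append(obj)
        pos = end
    return objects

response = 'I wrapped the result in {braces}. Result: {"name": "Sarah Davis", "age": null}'
print(extract_json_objects(response))
```

When a response contains several objects, taking the last element of the returned list is a reasonable default, since models usually state the final answer at the end.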

### Zero-Shot Data Cleaning: Direct Instruction Approach



In [14]:
def zero_shot_data_cleaning_single(record):
    """
    Use zero-shot prompting to clean a single data record
    """
    messages = [
        {
            "role": "user",
            "content": f"""Clean and standardize this data record. Return ONLY valid JSON with fields: name, age, job_title, email, phone.

Raw data: "{record}"

Requirements:
- Name: Title case (First Last)
- Age: Number only (if unknown, use null)
- Job title: Title case
- Email: Lowercase
- Phone: Format as XXX-XXX-XXXX

Return only the JSON object:"""
        }
    ]
    
    return call_llama(messages)

# Apply zero-shot cleaning to first 3 records
sample_df = messy_df.head(3).copy()
cleaned_sample = apply_cleaning_to_dataframe(sample_df, zero_shot_data_cleaning_single)

print("\nOriginal records:")
print(sample_df[['record_id', 'raw_record']].to_string(index=False))
print("\nCleaned records:")
print(cleaned_sample.to_string(index=False))

# Combine original and cleaned data
result_df = pd.concat([sample_df.reset_index(drop=True), cleaned_sample], axis=1)
print("\nCombined DataFrame:")
print(result_df.to_string(index=False))

Processing record 1...
Processing record 2...
Processing record 3...

Original records:
 record_id                                                              raw_record
         1       john doe, 25, software engineer, john.doe@email.com, 555-123-4567
         2         JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543
         3 Bob Johnson, thirty-two, Sales Rep, bob.johnson@gmail.com, 555.321.9876

Cleaned records:
       name  age         job_title                 email        phone
   John Doe 25.0 Software Engineer    john.doe@email.com 555-123-4567
 Jane Smith  NaN Marketing Manager       jane@company.co 555-987-6543
Bob Johnson  NaN         Sales Rep bob.johnson@gmail.com 555-321-9876

Combined DataFrame:
                                                             raw_record  record_id        name  age         job_title                 email        phone
      john doe, 25, software engineer, john.doe@email.com, 555-123-4567          1    John Doe 25.0 Software


### Zero-Shot Data Cleaning Analysis

This zero-shot approach demonstrates several important principles:

**Clear Requirements Specification**: 
- **Output format**: "Return ONLY valid JSON" eliminates ambiguity
- **Field specification**: Lists exact field names expected
- **Format standards**: Defines specific formatting rules for each field type

**Standardization Rules**: 
- **Name**: Title case ensures consistent capitalization
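- **Age**: Number or null covers numeric, empty, and "unknown" inputs; note that in the run above "thirty-two" also came back as null rather than 32
- **Job title**: Title case standardizes professional titles
- **Email**: Lowercase matches common practice, since most mail systems treat addresses as case-insensitive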
- **Phone**: XXX-XXX-XXXX creates uniform format

**Business Logic Integration**: 
The prompt embeds common business rules:
- Handle "unknown" as null values
- Standardize phone formats for database storage
- Apply consistent capitalization for professional presentation

**Why This Works**: 
Zero-shot cleaning leverages the model's training on similar data transformation tasks. The clear specification helps the model understand the exact transformation needed.

**DataFrame Integration Benefits**: 
- **Side-by-side comparison**: Original and cleaned data are easily comparable
- **Quality assessment**: You can immediately see cleaning effectiveness
- **Further processing**: Cleaned DataFrames integrate with existing pandas workflows
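
Because the prompt states its requirements explicitly, each one can also be checked in code after the model responds. The following is a minimal sketch of such a check; `validate_cleaned_record` is a hypothetical helper, not part of the notebook, and it only covers the rules that are easy to test mechanically.

```python
import re

PHONE_PATTERN = re.compile(r"^\d{3}-\d{3}-\d{4}$")
REQUIRED_FIELDS = ("name", "age", "job_title", "email", "phone")

def validate_cleaned_record(record):
    """Return a list of rule violations for one cleaned record.

    An empty list means the record passes every check. Hypothetical
    helper mirroring the requirements in the zero-shot prompt.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in record:
            problems.append(f"missing field: {field}")

    age = record.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly
    if age is not None and (isinstance(age, bool) or not isinstance(age, int)):
        problems.append("age must be an integer or null")

    email = record.get("email")
    if isinstance(email, str) and email != email.lower():
        problems.append("email must be lowercase")

    phone = record.get("phone")
    if isinstance(phone, str) and not PHONE_PATTERN.match(phone):
        problems.append("phone must match XXX-XXX-XXXX")
    return problems

cleaned = {"name": "Mike Brown", "age": "45", "job_title": "HR Manager",
           "email": "mike@hr.company.com", "phone": "+1-555-234-5678"}
print(validate_cleaned_record(cleaned))
```

Records that fail validation can be retried with a more specific prompt or routed to manual review.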

### Few-Shot Data Cleaning: Learning from Examples



In [15]:
def few_shot_data_cleaning_batch(df, num_examples=2):
    """
    Use few-shot prompting to clean multiple records with examples
    """
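    # Use first few records as examples
    examples = df.head(num_examples)
    target_records = df.iloc[num_examples:num_examples+3]  # Next 3 records
    
    # Hand-written cleaned output for each example record, in the same order
    # as df.head(num_examples); zip() stops at the shorter of the two lists
    example_outputs = [
        {"name": "John Doe", "age": 25, "job_title": "Software Engineer", "email": "john.doe@email.com", "phone": "555-123-4567"},
        {"name": "Jane Smith", "age": None, "job_title": "Marketing Manager", "email": "jane@company.co", "phone": "555-987-6543"},
    ]
    
    # Create examples text, pairing each raw record with its own cleaned output
    examples_text = ""
    for i, ((_, row), cleaned) in enumerate(zip(examples.iterrows(), example_outputs), 1):
        examples_text += f"""Example {i}:
Raw: "{row['raw_record']}"
Cleaned: {json.dumps(cleaned)}

"""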
    
    cleaned_results = []
    
    for _, row in target_records.iterrows():
        messages = [
            {
                "role": "user",
                "content": f"""Clean and standardize data records following these examples:

{examples_text}
Now clean this record (return ONLY the JSON):
Raw: "{row['raw_record']}"
Cleaned:"""
            }
        ]
        
        response = call_llama(messages)
        parsed_json = parse_cleaned_json(response)
        
        if parsed_json:
            cleaned_results.append(parsed_json)
        else:
            cleaned_results.append({'error': 'Failed to parse'})
    
    cleaned_df = pd.DataFrame(cleaned_results)
    
    print("Few-shot Data Cleaning Results:")
    print("\nTarget records:")
    print(target_records[['record_id', 'raw_record']].to_string(index=False))
    print("\nCleaned results:")
    print(cleaned_df.to_string(index=False))
    
    return cleaned_df

few_shot_results = few_shot_data_cleaning_batch(messy_df)

Few-shot Data Cleaning Results:

Target records:
 record_id                                                                 raw_record
         3    Bob Johnson, thirty-two, Sales Rep, bob.johnson@gmail.com, 555.321.9876
         4 mary williams, 28, Data Scientist, mary.williams@company.com, 555-456-7890
         5           MIKE BROWN, 45, HR Manager, mike@hr.company.com, +1-555-234-5678

Cleaned results:
         name  age            job_title                     email           phone
  Bob Johnson   32 Sales Representative     bob.johnson@gmail.com    555-321-9876
Mary Williams   28       Data Scientist mary.williams@company.com    555-456-7890
   Mike Brown   45           HR Manager       mike@hr.company.com +1-555-234-5678



### Few-Shot Data Cleaning Advantages

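The few-shot approach can improve on zero-shot, provided the examples cover the cases that matter. In the run above, "thirty-two" became 32 where the zero-shot run returned null, but "+1-555-234-5678" kept its country code because no example shows that format. The main advantages are: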

**Pattern Learning**: 
- **Concrete examples**: Shows exact input-output transformations
- **Format consistency**: Establishes precise JSON structure
- **Edge case handling**: Examples demonstrate how to handle missing data

**Example Strategy**: 
- **Example 1**: Clean, standard input → standard output
- **Example 2**: Missing age field → null value handling
- **Coverage**: Examples represent different types of data quality issues

**Consistency Benefits**: 
- **Reduced variation**: Examples constrain output format
- **Standardized field names**: Prevents field naming inconsistencies
- **Predictable data types**: Examples establish type conventions

**Business Value**: 
Few-shot cleaning is particularly valuable when:
- You need consistent output across large datasets
- Your data has domain-specific cleaning requirements
- You want to establish organizational data standards

**Scalability Considerations**: 
Although the function is named `few_shot_data_cleaning_batch`, it sends one request per record. Every request carries the same examples, which encourages, but does not guarantee, consistent output across the dataset.
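
Few-shot examples can also be supplied as earlier conversation turns instead of one long prompt string. The sketch below builds such a message list, assuming the chat endpoint accepts prior `assistant` messages, as the message format described in the setup section suggests. The helper name `build_few_shot_messages` and the cleaned values in `examples` are written by hand for illustration.

```python
import json

def build_few_shot_messages(examples, raw_record):
    """Turn (raw, cleaned) pairs into alternating user/assistant turns.

    Hypothetical helper: each example becomes a user message with the raw
    record followed by an assistant message with the cleaned JSON.
    """
    messages = [{
        "role": "system",
        "content": "Clean each data record and reply with only a JSON object.",
    }]
    for raw, cleaned in examples:
        messages.append({"role": "user", "content": f'Raw: "{raw}"'})
        messages.append({"role": "assistant", "content": json.dumps(cleaned)})
    # The record to clean goes last, in the same shape as the examples
    messages.append({"role": "user", "content": f'Raw: "{raw_record}"'})
    return messages

examples = [
    ("john doe, 25, software engineer, john.doe@email.com, 555-123-4567",
     {"name": "John Doe", "age": 25, "job_title": "Software Engineer",
      "email": "john.doe@email.com", "phone": "555-123-4567"}),
    ("JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543",
     {"name": "Jane Smith", "age": None, "job_title": "Marketing Manager",
      "email": "jane@company.co", "phone": "555-987-6543"}),
]
messages = build_few_shot_messages(
    examples, "sarah davis, unknown, developer, sarah.davis@tech.com, 5551234567"
)
print(len(messages))
```

Keeping each raw record paired with its own cleaned output matters: the second example here is what teaches the model that an empty age becomes `null`.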

### Chain of Thought Data Cleaning: Reasoning Through Decisions



In [16]:
def cot_data_cleaning_with_reasoning(df, record_index=5):
    """
    Use chain of thought reasoning for complex data cleaning decisions
    """
    target_record = df.iloc[record_index]
    
    messages = [
        {
            "role": "user",
            "content": f"""Clean this data record step by step, explaining your reasoning:

Record ID: {target_record['record_id']}
Raw data: "{target_record['raw_record']}"

Please follow this process:
1. Parse the fields (identify name, age, job, email, phone)
2. Standardize name (proper case)
3. Handle age (convert to number or null if unknown)
4. Standardize job title (proper case)
5. Standardize email (lowercase)
6. Standardize phone (XXX-XXX-XXXX format)
7. Output final JSON

Show your step-by-step reasoning, then provide the final JSON:"""
        }
    ]
    
    response = call_llama(messages)
    print(f"Chain of Thought Data Cleaning for Record {target_record['record_id']}:")
    print(f"Raw: {target_record['raw_record']}")
    print("\nLLM Response with Reasoning:")
    print(response)
    
    # Extract the final JSON
    parsed_json = parse_cleaned_json(response)
    if parsed_json:
        cleaned_df = pd.DataFrame([parsed_json])
        print("\nExtracted cleaned data:")
        print(cleaned_df.to_string(index=False))
        return cleaned_df
    else:
        print("\nFailed to extract JSON from response")
        return None

cot_result = cot_data_cleaning_with_reasoning(messy_df, record_index=5)

Chain of Thought Data Cleaning for Record 6:
Raw: sarah davis, unknown, developer, sarah.davis@tech.com, 5551234567

LLM Response with Reasoning:
Let's go through the steps to clean the data record.

### Step 1: Parse the fields

The raw data is: "sarah davis, unknown, developer, sarah.davis@tech.com, 5551234567"

We can identify the fields as follows:

- Name: "sarah davis"
- Age: "unknown"
- Job: "developer"
- Email: "sarah.davis@tech.com"
- Phone: "5551234567"

### Step 2: Standardize name (proper case)

The name is "sarah davis". To standardize it to proper case, we capitalize the first letter of each word.

- Standardized Name: "Sarah Davis"

### Step 3: Handle age (convert to number or null if unknown)

The age is given as "unknown". Since it's not a valid number, we will convert it to `null`.

- Age: `null`

### Step 4: Standardize job title (proper case)

The job title is "developer". To standardize it to proper case, we capitalize the first letter.

- Standardized Job: "Develo


### Chain of Thought Data Cleaning Benefits

CoT data cleaning provides unique advantages for complex data quality scenarios:

**Decision Transparency**: 
- **Step-by-step reasoning**: Shows how each cleaning decision was made
- **Rule application**: Demonstrates how business rules are applied
- **Error detection**: Makes it easier to spot and fix cleaning logic errors

**Complex Decision Making**: 
This approach excels when cleaning decisions require:
- **Contextual analysis**: Understanding what "unknown" means in an age field
- **Multi-step transformations**: Phone number parsing and reformatting
- **Ambiguity resolution**: Deciding how to handle edge cases

**Quality Assurance**: 
- **Auditable process**: Each step can be verified independently
- **Learning opportunity**: Understanding reasoning helps improve future prompts
- **Debugging support**: When cleaning fails, you can see where the logic broke down

**Business Applications**: 
CoT cleaning is particularly valuable for:
- **Regulated industries**: Where data transformation must be documented
- **High-stakes data**: Where cleaning errors have significant consequences
- **Training scenarios**: Where understanding the process is as important as the result

**Record Selection**: 
We chose record 6 (positional index 5: sarah davis, unknown, developer...) because it contains multiple edge cases that benefit from step-by-step reasoning.
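
When reasoning and JSON share one response, it helps to separate them explicitly so the reasoning can be stored for audit and the JSON parsed on its own. One option, sketched here, is to ask the model to write a fixed marker before the final answer and split on it. The marker `FINAL_JSON:` and the helper `split_reasoning_and_json` are assumptions for illustration; the prompt above does not request a marker.

```python
import json

# Hypothetical marker that the prompt would instruct the model to emit
FINAL_MARKER = "FINAL_JSON:"

def split_reasoning_and_json(response_text):
    """Return (reasoning, parsed_json_or_None) from a marker-delimited response."""
    reasoning, marker, tail = response_text.rpartition(FINAL_MARKER)
    if not marker:
        # Marker missing: treat the whole response as reasoning
        return response_text.strip(), None
    # Tolerate code fences or trailing text around the object
    start, end = tail.find("{"), tail.rfind("}")
    if start == -1 or end < start:
        return reasoning.strip(), None
    try:
        return reasoning.strip(), json.loads(tail[start:end + 1])
    except json.JSONDecodeError:
        return reasoning.strip(), None

sample = 'Step 3: the age is "unknown", so use null.\nFINAL_JSON:\n{"name": "Sarah Davis", "age": null}'
reasoning, cleaned = split_reasoning_and_json(sample)
print(cleaned)
```

Splitting on the last occurrence of the marker means that braces inside the reasoning text cannot interfere with the final parse.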

### Batch Processing: Entire DataFrame Consistency



In [17]:
def batch_clean_entire_dataframe(df):
    """
    Clean entire DataFrame in one batch request for consistency
    """
    # Convert DataFrame to numbered list format
    records_text = "\n".join([
        f"{row['record_id']}. {row['raw_record']}" 
        for _, row in df.iterrows()
    ])
    
    messages = [
        {
            "role": "system",
            "content": "You are a data cleaning expert who ensures consistency across all records in a dataset."
        },
        {
            "role": "user",
            "content": f"""Clean this entire dataset while maintaining consistency across all records:

Dataset:
{records_text}

Requirements:
1. Standardize all names to Title Case
2. Convert ages to numbers (use null for unknown/invalid)
3. Standardize job titles consistently (e.g., 'Software Engineer' for all dev roles)
4. Format all emails in lowercase
5. Format all phones as XXX-XXX-XXXX
6. Return as a JSON array with exactly {len(df)} objects

Focus on consistency - if you see similar job titles, standardize them the same way.
Return ONLY the JSON array:"""
        }
    ]
    
    response = call_llama(messages)
    print("Batch Cleaning Response:")
    print(response[:500] + "..." if len(response) > 500 else response)
    
    # Try to parse the JSON array
    try:
        # Find the JSON array in the response
        start_idx = response.find('[')
        end_idx = response.rfind(']') + 1
        
        if start_idx != -1 and end_idx != 0:
            json_str = response[start_idx:end_idx]
            cleaned_data = json.loads(json_str)
            cleaned_df = pd.DataFrame(cleaned_data)
            
            print(f"\nSuccessfully parsed {len(cleaned_df)} records")
            return cleaned_df
        else:
            print("\nNo JSON array found in response")
            return None
            
    except json.JSONDecodeError as e:
        print(f"\nJSON parsing error: {e}")
        return None

# Clean the entire dataset
batch_cleaned_df = batch_clean_entire_dataframe(messy_df)

if batch_cleaned_df is not None:
    print("\nBatch Cleaned DataFrame:")
    print(batch_cleaned_df.to_string(index=False))
    
    # Combine with original data for comparison
    comparison_df = pd.concat([
        messy_df[['record_id', 'raw_record']].reset_index(drop=True),
        batch_cleaned_df.reset_index(drop=True)
    ], axis=1)
    
    print("\nSide-by-side comparison:")
    print(comparison_df.to_string(index=False))

Batch Cleaning Response:
```json
[
  {"name": "John Doe", "age": 25, "jobTitle": "Software Engineer", "email": "john.doe@email.com", "phone": "555-123-4567"},
  {"name": "Jane Smith", "age": null, "jobTitle": "Marketing Manager", "email": "jane@company.co", "phone": "555-987-6543"},
  {"name": "Bob Johnson", "age": null, "jobTitle": "Sales Representative", "email": "bob.johnson@gmail.com", "phone": "555-321-9876"},
  {"name": "Mary Williams", "age": 28, "jobTitle": "Data Scientist", "email": "mary.williams@company.com",...

Successfully parsed 8 records

Batch Cleaned DataFrame:
         name  age             jobTitle                     email        phone
     John Doe 25.0    Software Engineer        john.doe@email.com 555-123-4567
   Jane Smith  NaN    Marketing Manager           jane@company.co 555-987-6543
  Bob Johnson  NaN Sales Representative     bob.johnson@gmail.com 555-321-9876
Mary Williams 28.0       Data Scientist mary.williams@company.com 555-456-7890
   Mike Brown 45.0 


### Batch Processing Advantages and Challenges

Batch processing sends the whole dataset in a single request, which brings both benefits and risks:

**Consistency Benefits**: 
- **Cross-record standardization**: The model sees all data simultaneously
- **Pattern recognition**: Can identify and standardize similar job titles across records
- **Global decision making**: Ensures consistent treatment of similar issues

**Example Consistency Rules**: 
- **Job title standardization**: "software engineer" and "developer" → consistent choice
- **Format uniformity**: All phone numbers follow the same pattern
- **Case standardization**: All names use identical capitalization rules

**Technical Challenges**: 
- **JSON array parsing**: More complex than single-record JSON
- **Token limits**: Large datasets may exceed model context windows
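- **Error recovery**: One parsing error can affect the entire batch
- **Schema drift**: The batch prompt does not name the output fields, and in the run above the model returned `jobTitle` where the other approaches use `job_title`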

**Performance Considerations**: 
- **API efficiency**: One call vs. multiple calls (cost and speed)
- **Memory usage**: Large DataFrames require more memory
- **Error handling**: Balance between robustness and processing speed

**When to Use Batch Processing**: 
- **Small to medium datasets**: Under token limits
- **Consistency requirements**: When cross-record standardization matters
- **Cost optimization**: When API call minimization is important
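
Two of the challenges above, token limits and inconsistent field names, can be softened with small utilities. The sketch below splits records into fixed-size batches and converts camelCase keys such as `jobTitle` to snake_case. All three function names are hypothetical, and a count-based batch size is only a stand-in for a real token budget.

```python
import re

def chunk_records(records, batch_size):
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

def to_snake_case(key):
    """Convert a camelCase key such as 'jobTitle' to 'job_title'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", key).lower()

def normalize_keys(record):
    """Return a copy of record with every key converted to snake_case."""
    return {to_snake_case(key): value for key, value in record.items()}

print(list(chunk_records(list(range(8)), 3)))
print(normalize_keys({"name": "Mary Williams", "jobTitle": "Data Scientist"}))
```

Naming the expected fields in the prompt remains the first line of defense; key normalization is a fallback for when the model drifts anyway.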

### Data Quality Assessment: Measuring Success



In [18]:
def assess_data_quality(original_df, cleaned_df):
    """
    Perform data quality assessment on cleaned DataFrame
    """
    print("=== DATA QUALITY ASSESSMENT ===")
    
    # Basic statistics
    print(f"\nOriginal records: {len(original_df)}")
    print(f"Cleaned records: {len(cleaned_df) if cleaned_df is not None else 0}")
    
    if cleaned_df is not None:
        # Check for missing values
        print("\nMissing values in cleaned data:")
        missing_counts = cleaned_df.isnull().sum()
        for column, count in missing_counts.items():
            percentage = (count / len(cleaned_df)) * 100
            print(f"  {column}: {count} ({percentage:.1f}%)")
        
        # Data type analysis
        print("\nData types:")
        print(cleaned_df.dtypes)
        
        # Age validation
        if 'age' in cleaned_df.columns:
            valid_ages = cleaned_df['age'].dropna()
            if len(valid_ages) > 0:
                print(f"\nAge statistics:")
                print(f"  Valid ages: {len(valid_ages)} / {len(cleaned_df)}")
                print(f"  Age range: {valid_ages.min()} - {valid_ages.max()}")
                print(f"  Average age: {valid_ages.mean():.1f}")
        
        # Email validation
        if 'email' in cleaned_df.columns:
            emails = cleaned_df['email'].dropna()
            valid_emails = emails[emails.str.contains('@', na=False)]
            print(f"\nEmail validation:")
            print(f"  Valid email format: {len(valid_emails)} / {len(emails)}")
        
        # Phone validation
        if 'phone' in cleaned_df.columns:
            phones = cleaned_df['phone'].dropna()
            # Check for XXX-XXX-XXXX format
            standard_format = phones.str.match(r'^\d{3}-\d{3}-\d{4}$', na=False)
            print(f"\nPhone validation:")
            print(f"  Standard format (XXX-XXX-XXXX): {standard_format.sum()} / {len(phones)}")
        
        # Show sample of cleaned data
        print("\nSample cleaned records:")
        print(cleaned_df.head(3).to_string(index=False))
        
        return cleaned_df
    else:
        print("\nNo cleaned data available for assessment")
        return None

# Assess the quality of our batch cleaned data
quality_result = assess_data_quality(messy_df, batch_cleaned_df)

=== DATA QUALITY ASSESSMENT ===

Original records: 8
Cleaned records: 8

Missing values in cleaned data:
  name: 0 (0.0%)
  age: 3 (37.5%)
  jobTitle: 0 (0.0%)
  email: 0 (0.0%)
  phone: 0 (0.0%)

Data types:
name         object
age         float64
jobTitle     object
email        object
phone        object
dtype: object

Age statistics:
  Valid ages: 5 / 8
  Age range: 25.0 - 45.0
  Average age: 32.4

Email validation:
  Valid email format: 8 / 8

Phone validation:
  Standard format (XXX-XXX-XXXX): 8 / 8

Sample cleaned records:
       name  age             jobTitle                 email        phone
   John Doe 25.0    Software Engineer    john.doe@email.com 555-123-4567
 Jane Smith  NaN    Marketing Manager       jane@company.co 555-987-6543
Bob Johnson  NaN Sales Representative bob.johnson@gmail.com 555-321-9876



### Comprehensive Data Quality Assessment

Quality assessment is crucial for validating prompt engineering effectiveness:

**Multi-Dimensional Quality Metrics**: 

1. **Completeness Assessment**: 
   - **Missing value analysis**: Identifies fields with incomplete data
   - **Percentage calculations**: Quantifies data completeness
   - **Field-specific analysis**: Different fields may have different completeness expectations

2. **Data Type Validation**: 
   - **Type consistency**: Ensures fields contain expected data types
   - **Format compliance**: Validates against business rules
   - **Range validation**: Checks for reasonable value ranges

3. **Domain-Specific Validation**: 
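   - **Age reasonableness**: The function reports the minimum, maximum, and mean age so you can compare them against an expected band, such as 18-100 for employee data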
   - **Email format validation**: Ensures @ symbol presence
   - **Phone format compliance**: XXX-XXX-XXXX pattern matching

**Business Value of Quality Assessment**: 
- **Confidence building**: Quantifies cleaning effectiveness
- **Process improvement**: Identifies areas needing prompt refinement
- **Compliance support**: Documents data quality for regulatory requirements
- **Decision support**: Provides metrics for choosing between cleaning approaches

**Quality Metrics Interpretation**: 
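- **8 / 8 emails containing `@`**: A minimal sanity check that every address has the basic shape, not proof that the addresses are valid
- **8 / 8 phones in XXX-XXX-XXXX format**: The phone rule in the batch prompt was followed for every record
- **3 missing ages (37.5%)**: Two ages were absent in the source ("" and "unknown"), and "thirty-two" was also returned as null, so one recoverable value was lost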
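
The assessment function reports the age range but does not enforce one, and its email check only looks for an `@`. A stricter variant could look like the sketch below, which works on the parsed JSON dictionaries before they are loaded into a DataFrame. The helper names and the email pattern are assumptions for illustration; the pattern is deliberately simple and is not a full address validator.

```python
import re

# Simple shape check: something@something.something, no spaces
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def age_in_range(age, low=18, high=100):
    """True when age is missing (None) or an integer within [low, high]."""
    if age is None:
        return True
    return isinstance(age, int) and not isinstance(age, bool) and low <= age <= high

def looks_like_email(value):
    """True when value is a string with a basic local@domain.tld shape."""
    return isinstance(value, str) and bool(EMAIL_PATTERN.match(value))

def summarize(records):
    """Count how many records pass each check."""
    return {
        "total": len(records),
        "age_in_range": sum(age_in_range(r.get("age")) for r in records),
        "email_ok": sum(looks_like_email(r.get("email")) for r in records),
    }

records = [
    {"age": 25, "email": "john.doe@email.com"},
    {"age": None, "email": "jane@company.co"},
    {"age": 150, "email": "not-an-email@"},
]
print(summarize(records))
```

Running the checks on plain dictionaries avoids a pandas detail visible in the output above: a column that mixes integers and missing values is stored as `float64`, so 25 is displayed as 25.0.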

### Production Data Cleaning Pipeline



In [19]:
def production_data_cleaning_pipeline(df):
    """
    Production-ready data cleaning pipeline with error handling
    """
    print("=== PRODUCTION DATA CLEANING PIPELINE ===")
    
    # Initialize results DataFrame
    results_df = df.copy()
    
    # Add processing metadata columns
    results_df['processing_status'] = 'pending'
    results_df['processing_timestamp'] = pd.Timestamp.now()
    results_df['cleaning_method'] = 'llm_prompt'
    
    cleaned_records = []
    
    for idx, row in df.iterrows():
        try:
            print(f"Processing record {row['record_id']}...")
            
            # Use role-based cleaning for production
            messages = [
                {
                    "role": "system",
                    "content": "You are a data quality specialist. Clean data according to enterprise standards and return only valid JSON."
                },
                {
                    "role": "user",
                    "content": f"""Clean this record to enterprise standards. Return only JSON with fields: name, age, job_title, email, phone.

Raw: "{row['raw_record']}"

Standards:
- Name: Title Case
- Age: Integer 18-100 or null
- Job Title: Standardized corporate titles
- Email: Valid lowercase format
- Phone: XXX-XXX-XXXX format

JSON only:"""
                }
            ]
            
            response = call_llama(messages)
            parsed_json = parse_cleaned_json(response)
            
            if parsed_json:
                # Add metadata
                parsed_json['source_record_id'] = row['record_id']
                parsed_json['processing_status'] = 'success'
                cleaned_records.append(parsed_json)
                
                # Update status in results
                results_df.loc[idx, 'processing_status'] = 'success'
                
            else:
                # Handle parsing failures
                fallback_record = {
                    'name': None,
                    'age': None,
                    'job_title': None,
                    'email': None,
                    'phone': None,
                    'source_record_id': row['record_id'],
                    'processing_status': 'failed_parsing',
                    'error_message': 'Could not parse LLM response'
                }
                cleaned_records.append(fallback_record)
                results_df.loc[idx, 'processing_status'] = 'failed_parsing'
                
        except Exception as e:
            # Handle API or other errors
            error_record = {
                'name': None,
                'age': None,
                'job_title': None,
                'email': None,
                'phone': None,
                'source_record_id': row['record_id'],
                'processing_status': 'error',
                'error_message': str(e)
            }
            cleaned_records.append(error_record)
            results_df.loc[idx, 'processing_status'] = 'error'
            print(f"Error processing record {row['record_id']}: {e}")
    
    # Create cleaned DataFrame
    cleaned_df = pd.DataFrame(cleaned_records)
    
    # Generate processing report
    print("\n=== PROCESSING REPORT ===")
    status_counts = results_df['processing_status'].value_counts()
    print("Processing status summary:")
    for status, count in status_counts.items():
        percentage = (count / len(results_df)) * 100
        print(f"  {status}: {count} ({percentage:.1f}%)")
    
    # Show successful records
    successful_records = cleaned_df[cleaned_df['processing_status'] == 'success']
    if len(successful_records) > 0:
        print(f"\nSample successful records ({len(successful_records)} total):")
        display_cols = ['name', 'age', 'job_title', 'email', 'phone']
        print(successful_records[display_cols].head(3).to_string(index=False))
    
    # Save results to CSV (in real production, you'd save to database)
    output_file = '/tmp/cleaned_data_results.csv'
    cleaned_df.to_csv(output_file, index=False)
    print(f"\nResults saved to: {output_file}")
    
    return cleaned_df, results_df

# Run production pipeline
production_cleaned_df, production_results_df = production_data_cleaning_pipeline(messy_df.head(4))  # Process first 4 for demo

=== PRODUCTION DATA CLEANING PIPELINE ===
Processing record 1...
Processing record 2...
Processing record 3...
Processing record 4...

=== PROCESSING REPORT ===
Processing status summary:
  success: 4 (100.0%)

Sample successful records (4 total):
       name  age            job_title                 email        phone
   John Doe 25.0    Software Engineer    john.doe@email.com 555-123-4567
 Jane Smith  NaN    Marketing Manager       jane@company.co 555-987-6543
Bob Johnson 32.0 Sales Representative bob.johnson@gmail.com 555-321-9876

Results saved to: /tmp/cleaned_data_results.csv



### Production-Ready Data Processing

This pipeline sketches the pieces a production data cleaning job typically needs:

**Comprehensive Error Handling**: 
- **try/except blocks**: Prevent individual record failures from stopping the process
- **Multiple error types**: Distinguishes between API errors and parsing failures
- **Graceful degradation**: Continues processing even when some records fail

**Metadata Tracking**: 
- **Processing timestamps**: Records when the pipeline run started (a single `pd.Timestamp.now()` value shared by all rows)
- **Status tracking**: `success`, `failed_parsing`, and `error` categories
- **Method documentation**: Records which cleaning approach was used
- **Source traceability**: Links cleaned records back to originals

**Business Process Integration**: 
- **Progress reporting**: Shows processing status for operational visibility
- **Quality metrics**: Provides success/failure rates for process monitoring
- **Output persistence**: Saves results for downstream processing
- **Audit trail**: Maintains processing history for compliance
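
As a rough sketch of the quality-metrics item, the per-record statuses can be turned into counts and percentages like those in the processing report. The helper name `summarize_statuses` and the sample status list are assumptions made for illustration.

```python
from collections import Counter

def summarize_statuses(statuses):
    """Return {status: (count, percent)} for a list of per-record statuses."""
    if not statuses:
        return {}
    total = len(statuses)
    counts = Counter(statuses)
    return {
        status: (count, round(100.0 * count / total, 1))
        for status, count in counts.most_common()
    }

# Hypothetical statuses from a four-record run with one parsing failure.
statuses = ["success", "success", "parsing_error", "success"]
summary = summarize_statuses(statuses)
for status, (count, pct) in summary.items():
    print(f"  {status}: {count} ({pct}%)")
```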

**Enterprise Considerations**: 
- **Scalability**: Records are processed one at a time, so the loop is not tied to a particular dataset size
- **Reliability**: Per-record error handling keeps a single failure from aborting the run
- **Monitoring**: Status tracking enables operational oversight
- **Integration**: CSV output can be consumed by existing data pipelines

**Role-Based Cleaning**: 
Uses data quality specialist role for:
- **Enterprise standards**: Applies organizational data governance rules
- **Professional expertise**: Leverages domain knowledge for cleaning decisions
- **Consistent methodology**: Ensures uniform approach across all records
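
One way to pair a specialist role with explicit organizational rules is to state the persona in the system message and the required schema in the user message. The sketch below is an assumption about how such a prompt could be assembled, not the prompt used by the pipeline: `build_role_messages`, `REQUIRED_FIELDS`, and the prompt wording are hypothetical, with the field names taken from the columns in the processing report.

```python
import json

REQUIRED_FIELDS = ["name", "age", "job_title", "email", "phone"]

def build_role_messages(raw_record, fields=REQUIRED_FIELDS):
    """Build a chat message list pairing a specialist persona with an explicit schema."""
    schema = {field: "..." for field in fields}
    system = (
        "You are a data quality specialist. Apply the same formatting rules "
        "to every record and never invent missing values; use null instead."
    )
    user = (
        "Clean the record below and return only a JSON object with exactly "
        f"these keys: {json.dumps(schema)}\n\nRecord: {raw_record}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_role_messages(
    "JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543"
)
for message in messages:
    print(message["role"], "->", message["content"][:60])
```

The resulting list has the same shape as the `messages` argument accepted by `call_llama`.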

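### Approach Comparison: Choosing the Right Technique

The function below runs two prompting styles on the same record, parses each reply as JSON, and tabulates the results side by side.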



In [20]:
def compare_cleaning_approaches(df, sample_record_idx=0):
    """
    Compare different prompt engineering approaches on the same record
    """
    target_record = df.iloc[sample_record_idx]
    
    print(f"=== COMPARING APPROACHES FOR RECORD {target_record['record_id']} ===")
    print(f"Raw data: {target_record['raw_record']}")
    
    approaches = {
        'Zero-shot': zero_shot_data_cleaning_single,
        'Role-based': lambda record: call_llama([
            {"role": "system", "content": "You are a data cleaning expert."},
            {"role": "user", "content": f"Clean this record and return only JSON: {record}"}
        ])
    }
    
    results = {}
    
    for approach_name, cleaning_func in approaches.items():
        print(f"\n--- {approach_name} Approach ---")
        try:
            response = cleaning_func(target_record['raw_record'])
            parsed_json = parse_cleaned_json(response)
            
            if parsed_json:
                results[approach_name] = parsed_json
                print("Result:", json.dumps(parsed_json, indent=2))
            else:
                results[approach_name] = {'error': 'Failed to parse'}
                print("Failed to parse response")
                
        except Exception as e:
            results[approach_name] = {'error': str(e)}
            print(f"Error: {e}")
    
    # Create comparison DataFrame
    comparison_data = []
    for approach, result in results.items():
        row = {'approach': approach}
        row.update(result)
        comparison_data.append(row)
    
    comparison_df = pd.DataFrame(comparison_data)
    
    print("\n=== COMPARISON SUMMARY ===")
    print(comparison_df.to_string(index=False))
    
    return comparison_df

# Compare approaches on one record
approach_comparison = compare_cleaning_approaches(messy_df, sample_record_idx=1)

=== COMPARING APPROACHES FOR RECORD 2 ===
Raw data: JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543

--- Zero-shot Approach ---
Result: {
  "name": "Jane Smith",
  "age": null,
  "job_title": "Marketing Manager",
  "email": "jane@company.co",
  "phone": "555-987-6543"
}

--- Role-based Approach ---
Result: {
  "name": "JANE SMITH",
  "email": "jane@company.co",
  "phone": "(555) 987-6543",
  "title": "marketing manager"
}

=== COMPARISON SUMMARY ===
  approach       name  age         job_title           email          phone             title
 Zero-shot Jane Smith  NaN Marketing Manager jane@company.co   555-987-6543               NaN
Role-based JANE SMITH  NaN               NaN jane@company.co (555) 987-6543 marketing manager



### Strategic Approach Selection

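This comparison framework helps you choose a prompt engineering technique for a given task. In the run above, the zero-shot prompt returned all five fields in standardized form, while the minimal role-based prompt, which names no schema or formatting rules, left the name and phone unchanged, omitted `age`, and used the key `title` instead of `job_title`. A persona on its own does not guarantee a consistent output format.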

**Approach Evaluation Criteria**: 

1. **Output Quality**: 
   - **Accuracy**: How correctly does each approach clean the data?
   - **Consistency**: How uniform are the results across similar inputs?
   - **Completeness**: How well does each approach handle missing data?

2. **Processing Characteristics**: 
   - **Speed**: How quickly does each approach process data?
   - **Cost**: What are the API usage costs for each method?
   - **Scalability**: How well does each approach handle large datasets?

3. **Business Fit**: 
   - **Transparency needs**: How important is understanding the cleaning process?
   - **Consistency requirements**: How critical is cross-record standardization?
   - **Error tolerance**: How much variation in output is acceptable?
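
The completeness and consistency criteria can be checked mechanically. The sketch below compares each approach's keys against an expected schema, using the two outputs printed in the comparison run above; `schema_report` and `EXPECTED_KEYS` are hypothetical names introduced for illustration.

```python
EXPECTED_KEYS = {"name", "age", "job_title", "email", "phone"}

def schema_report(cleaned, expected_keys=EXPECTED_KEYS):
    """Compare a cleaned record's keys against the expected schema."""
    keys = set(cleaned)
    return {
        "missing": sorted(expected_keys - keys),
        "unexpected": sorted(keys - expected_keys),
        "conforms": keys == expected_keys,
    }

# Outputs copied from the comparison run above.
zero_shot = {"name": "Jane Smith", "age": None, "job_title": "Marketing Manager",
             "email": "jane@company.co", "phone": "555-987-6543"}
role_based = {"name": "JANE SMITH", "email": "jane@company.co",
              "phone": "(555) 987-6543", "title": "marketing manager"}

print("Zero-shot: ", schema_report(zero_shot))
print("Role-based:", schema_report(role_based))
```

A check like this only covers structure; accuracy still needs a comparison against known-good values.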

**Decision Framework**: 

- **Use Zero-Shot When**: 
  - Simple, well-defined cleaning rules
  - Quick prototyping or one-off tasks
  - Limited time for prompt development

- **Use Few-Shot When**: 
  - Need consistent output formatting
  - Have specific business rules to enforce
  - Working with domain-specific data

- **Use Chain of Thought When**: 
  - Complex cleaning decisions required
  - Need to understand/debug the cleaning process
  - Working in regulated environments

- **Use Role-Based When**: 
  - Need domain expertise applied
  - Want professional-grade output
  - Have specific organizational standards

- **Use Batch Processing When**: 
  - Cross-record consistency is critical
  - Cost optimization is important
  - Dataset size is manageable

**Hybrid Approaches**: 
In practice, you might combine techniques:
- **Few-shot + Role prompting**: Examples with expert persona
- **CoT + Batch processing**: Reasoning with consistency
- **Zero-shot + Production pipeline**: Simple rules with robust error handling
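
As an illustration of the first combination, few-shot examples can be supplied as earlier user/assistant turns beneath a system message that sets the persona. This is a sketch under assumptions: `build_few_shot_role_messages` is a hypothetical helper, and the raw example input is invented for the demonstration (only its cleaned form mirrors the pipeline output shown earlier).

```python
import json

def build_few_shot_role_messages(raw_record, examples):
    """Combine an expert persona with worked examples given as prior chat turns."""
    messages = [{
        "role": "system",
        "content": "You are a data cleaning expert. Reply with JSON only.",
    }]
    for example_input, example_output in examples:
        messages.append({"role": "user",
                         "content": f"Clean this record: {example_input}"})
        messages.append({"role": "assistant",
                         "content": json.dumps(example_output)})
    # The real request comes last, phrased exactly like the examples.
    messages.append({"role": "user", "content": f"Clean this record: {raw_record}"})
    return messages

# Illustrative example pair; the raw input string is made up.
examples = [(
    "john DOE, 25, software engineer, JOHN.DOE@EMAIL.COM, 5551234567",
    {"name": "John Doe", "age": 25, "job_title": "Software Engineer",
     "email": "john.doe@email.com", "phone": "555-123-4567"},
)]

messages = build_few_shot_role_messages(
    "JANE SMITH,, marketing manager, jane@company.co, (555) 987-6543", examples
)
print([m["role"] for m in messages])
```

Because the examples show the target keys and formats, the persona no longer has to carry the output specification alone.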

---

## Workshop Summary and Next Steps

Congratulations! You've completed a comprehensive journey through the essential techniques of prompt engineering. Let's consolidate your learning and chart the path forward.




### Key Takeaways and Strategic Insights

**Fundamental Principles You've Learned**: 

1. **Prompt Engineering is Problem-Solving**: 
   Each technique addresses specific challenges. Understanding when and why to use each approach is more valuable than memorizing prompt formats.

2. **Context and Clarity Drive Success**: 
   The most effective prompts provide clear context, specific instructions, and well-defined expectations. Ambiguity is the enemy of consistent AI performance.

3. **Examples Are Powerful Teachers**: 
   Few-shot prompting often provides the best balance of effort and results. Good examples can transform unreliable outputs into consistent, professional-grade results.

4. **Reasoning Improves Complex Tasks**: 
   Chain of thought prompting breaks down complex problems and makes AI decision-making transparent and verifiable.

5. **Roles Shape Behavior**: 
   Assigning an appropriate role can improve response quality by steering the model toward relevant knowledge and a suitable communication style. As the comparison above showed, a role works best alongside explicit output instructions.

**Business Applications and ROI**: 

The data cleaning examples suggest where business value can come from:
- **Time savings**: Automate hours of manual data standardization
- **Cost reduction**: Reduce the need for specialized data cleaning tools or services
- **Quality improvement**: Apply the same rules to every record, provided outputs are validated
- **Scalability**: Handle larger datasets without a proportional increase in manual effort

**Implementation Roadmap**: 

1. **Start Simple**: Begin with zero-shot prompting for straightforward tasks
2. **Add Examples**: Develop few-shot prompts for tasks requiring consistency
3. **Incorporate Reasoning**: Use CoT for complex decision-making processes
4. **Develop Roles**: Create specialized personas for domain-specific tasks
5. **Build Pipelines**: Integrate prompt engineering into production workflows

**Advanced Topics for Further Exploration**: 

- **Prompt Chaining**: Connecting multiple prompts for complex workflows
- **Dynamic Prompting**: Adapting prompts based on input characteristics
- **Evaluation Frameworks**: Systematically measuring prompt effectiveness
- **Model Comparison**: Evaluating different models for specific tasks
- **Prompt Optimization**: Iterative improvement of prompt performance
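
To make the first topic concrete, here is a minimal sketch of prompt chaining in which the reply to one prompt is embedded in the next. The names `run_chain` and `echo_model` and the prompt wording are assumptions for illustration; `echo_model` is a stub that records prompts instead of calling an API, so any callable that maps a prompt string to a reply string could take its place.

```python
def run_chain(raw_record, model_fn):
    """Two-step chain: extract the fields first, then standardize them."""
    extract_prompt = (
        "List the fields present in this record as 'field: value' lines.\n\n"
        f"{raw_record}"
    )
    extracted = model_fn(extract_prompt)

    # The second prompt is built from the first reply, not from the raw record.
    standardize_prompt = (
        "Standardize casing and phone format in these fields and return JSON only.\n\n"
        f"{extracted}"
    )
    return model_fn(standardize_prompt)

calls = []

def echo_model(prompt):
    # Stub model: records each prompt and returns a numbered placeholder reply.
    calls.append(prompt)
    return f"<reply {len(calls)}>"

final = run_chain("JANE SMITH,, marketing manager", echo_model)
print(final)
print(len(calls), "model calls")
```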

**Building Your Prompt Engineering Practice**: 

1. **Create a Prompt Library**: Build reusable prompts for common tasks
2. **Establish Quality Metrics**: Define success criteria for different use cases
3. **Document Best Practices**: Record what works well in your domain
4. **Iterate and Improve**: Continuously refine prompts based on results
5. **Share and Collaborate**: Build organizational knowledge around effective prompting
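
A prompt library can start as a dictionary of named templates. The sketch below uses the standard library's `string.Template`; `PROMPT_LIBRARY`, `render_prompt`, and the template texts are hypothetical and would be replaced by prompts that have proven reliable in your own domain.

```python
from string import Template

PROMPT_LIBRARY = {
    "clean_record": Template(
        "Clean this record and return only JSON with keys $keys:\n$record"
    ),
    "summarize": Template("Summarize the following text in $n sentences:\n$text"),
}

def render_prompt(name, **values):
    """Look up a template by name and fill in its placeholders.

    Raises KeyError if the template name is unknown or a placeholder is missing.
    """
    return PROMPT_LIBRARY[name].substitute(**values)

prompt = render_prompt(
    "clean_record",
    keys="name, email",
    record="JANE SMITH, jane@company.co",
)
print(prompt)
```

Using `substitute` rather than `safe_substitute` means a forgotten placeholder fails loudly instead of sending an incomplete prompt.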

**Final Thoughts**: 

Prompt engineering is both an art and a science. While this workshop provides structured techniques and best practices, the most effective prompt engineers develop intuition through experimentation and practice. The key is to start with these proven techniques and adapt them to your specific needs, datasets, and business requirements.

Remember that prompt engineering is rapidly evolving. Stay curious, keep experimenting, and don't hesitate to combine techniques in creative ways. The examples in this workshop are starting points; your own applications and innovations will drive the most value for your organization.

**Your Next Steps**: 
1. Choose a real dataset from your work and apply these techniques
2. Compare the effectiveness of different approaches on your data
3. Build a simple production pipeline using the patterns demonstrated
4. Share your results and learnings with colleagues
5. Continue exploring advanced prompt engineering techniques