# Advanced Prompting Techniques Comparison

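This notebook runs six of the 11 advanced prompting techniques listed below (Manager-Style Prompts, Role Prompting, Structured Output, Few-Shot Learning, Escape Hatches, and Thinking Traces) side by side on the same input to show their different strengths and use cases.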

## Techniques We'll Compare

1. **Manager-Style Prompts** - Detailed role-based instructions
2. **Role Prompting** - Specific persona adoption
3. **Task Planning** - Complex workflow breakdown
4. **Structured Output** - Formatted responses with tags
5. **Meta-Prompting** - Self-optimization and analysis
6. **Few-Shot Learning** - Learning from examples
7. **Prompt Folding** - Multi-step workflow management
8. **Escape Hatches** - Uncertainty handling
9. **Thinking Traces** - Step-by-step reasoning
10. **Model Distillation** - Optimization for production
11. **Evaluation Framework** - Testing and metrics

In [None]:
# Setup and imports
import os
import sys
from dotenv import load_dotenv
import json
from IPython.display import HTML, display
import time

# Add parent directory to path
sys.path.append('..')

# Load environment variables
load_dotenv()

import dspy

# Import the technique helpers (not every import is exercised in this notebook)
from src.prompts.manager_style import create_customer_support_manager
from src.techniques.role_prompting import create_veteran_engineer_persona
from src.techniques.task_planning import TaskOrchestrator
from src.techniques.structured_output import StructuredOutputGenerator, create_bug_report_schema
from src.techniques.meta_prompting import MetaPromptOptimizer
from src.techniques.few_shot import FewShotLearner, create_bug_analysis_examples, FewShotPromptTemplate
from src.techniques.escape_hatches import EscapeHatchResponder
from src.techniques.thinking_traces import ThinkingTracer

# Configure DSPy
api_key = os.getenv('OPENAI_API_KEY')
if not api_key:
    print("⚠️ Please set OPENAI_API_KEY in your .env file (the cells below need it)")
else:
    lm = dspy.LM(model="gpt-4o-mini", api_key=api_key, max_tokens=1500)
    dspy.settings.configure(lm=lm)
    print("✅ DSPy configured with OpenAI")

## Test Scenario: Code Review Request

We'll use a common software engineering scenario: reviewing a login function with potential security issues. Every technique receives the same task, context, and code, so differences in the responses come from the technique rather than the input.

In [None]:
# Common test scenario
test_code = '''
def login_user(username, password):
    query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
    cursor.execute(query)
    user = cursor.fetchone()
    if user:
        session['user_id'] = user[0]
        return redirect('/dashboard')
    else:
        return "Login failed"
'''

task = "Review this login function for security vulnerabilities"
context = "This is production code in a web application handling user authentication"

print("🔍 Test Scenario: Code Security Review")
print("\n📝 Code to Review:")
print(test_code)
print("\n🎯 Task:", task)
print("📋 Context:", context)
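
For reference when reading the responses, here is a minimal sketch of the usual remedy for the string-formatted query in `login_user`: a parameterized query. It uses Python's built-in `sqlite3` purely for illustration. The `users` table layout, the `password_hash` column, and both helper functions are assumptions, not part of the original application, and proper password hashing is out of scope here.

```python
import sqlite3

def find_user_unsafe(conn, username, password_hash):
    """Mirrors the pattern under review: user input is formatted into the SQL text."""
    query = (
        f"SELECT id FROM users WHERE username='{username}' "
        f"AND password_hash='{password_hash}'"
    )
    return conn.execute(query).fetchone()

def find_user(conn, username, password_hash):
    """Same lookup with placeholders: the driver passes values separately from the SQL."""
    cur = conn.execute(
        "SELECT id FROM users WHERE username = ? AND password_hash = ?",
        (username, password_hash),
    )
    return cur.fetchone()

# Tiny in-memory database, for illustration only (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password_hash TEXT)"
)
conn.execute(
    "INSERT INTO users (username, password_hash) VALUES (?, ?)",
    ("alice", "hash-of-secret"),
)

malicious_username = "alice' --"  # comments out the password check in the unsafe query

print("unsafe:", find_user_unsafe(conn, malicious_username, "anything"))
print("safe:  ", find_user(conn, malicious_username, "anything"))
```

With the formatted query the injected comment marker removes the password condition, so the attacker's lookup finds a row. With placeholders the whole string is treated as a username and no row matches.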

## 1. Manager-Style Prompts

In [None]:
print("👔 Manager-Style Prompts")
print("="*50)

# Create code review manager
from src.prompts.manager_style import create_code_review_manager
code_manager = create_code_review_manager()

start_time = time.time()
manager_result = code_manager(task=task, context=f"{context}\n\nCode:\n{test_code}")
manager_time = time.time() - start_time

print(f"⏱️ Response time: {manager_time:.2f}s")
print(f"📝 Response ({len(manager_result)} chars):")
print(manager_result[:500] + "..." if len(manager_result) > 500 else manager_result)

## 2. Role Prompting

In [None]:
print("🎭 Role Prompting (Veteran Engineer)")
print("="*50)

engineer = create_veteran_engineer_persona()

start_time = time.time()
role_result = engineer(task=task, context=f"{context}\n\nCode:\n{test_code}")
role_time = time.time() - start_time

print(f"⏱️ Response time: {role_time:.2f}s")
print(f"📝 Response ({len(role_result)} chars):")
print(role_result[:500] + "..." if len(role_result) > 500 else role_result)

## 3. Structured Output

In [None]:
print("📋 Structured Output")
print("="*50)

# Create code review schema
from src.techniques.structured_output import create_code_review_schema
generator = StructuredOutputGenerator()
code_schema = create_code_review_schema()

start_time = time.time()
structured_result = generator(
    task=task,
    schema=code_schema,
    context=f"{context}\n\nCode:\n{test_code}"
)
structured_time = time.time() - start_time

print(f"⏱️ Response time: {structured_time:.2f}s")
print(f"📝 Response ({len(structured_result)} chars):")
print(structured_result[:500] + "..." if len(structured_result) > 500 else structured_result)
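
Tag-formatted responses are useful mainly because they can be parsed downstream. The following is a rough sketch of such a parser, assuming the response wraps each field in simple `<tag>...</tag>` pairs. The tag names and the `extract_tags` helper are hypothetical and are not taken from `create_code_review_schema`.

```python
import re

def extract_tags(text, tag_names):
    """Return {tag: content} for each <tag>...</tag> pair; missing tags map to None."""
    fields = {}
    for tag in tag_names:
        pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
        match = re.search(pattern, text, re.DOTALL)
        fields[tag] = match.group(1).strip() if match else None
    return fields

# Hand-written sample standing in for a model response (not real output).
sample_response = """
<severity>critical</severity>
<issue>SQL injection via string-formatted query</issue>
<fix>Use parameterized queries</fix>
"""

parsed = extract_tags(sample_response, ["severity", "issue", "fix", "references"])
print(parsed)
```

Returning `None` for a missing tag, rather than raising, lets the caller decide whether an incomplete response should be retried or rejected.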

## 4. Few-Shot Learning

In [None]:
print("📚 Few-Shot Learning")
print("="*50)

# Create few-shot learner with bug analysis examples
examples = create_bug_analysis_examples()
template = FewShotPromptTemplate(
    task_intro="Analyze the following code for security vulnerabilities:",
    example_intro="Here are examples of good security analysis:",
    include_reasoning=True
)
learner = FewShotLearner(examples, template)

start_time = time.time()
fewshot_result = learner(f"{context}\n\nCode:\n{test_code}")
fewshot_time = time.time() - start_time

print(f"⏱️ Response time: {fewshot_time:.2f}s")
print(f"📝 Response ({len(fewshot_result)} chars):")
print(fewshot_result[:500] + "..." if len(fewshot_result) > 500 else fewshot_result)

## 5. Escape Hatches (Uncertainty Handling)

In [None]:
print("🚪 Escape Hatches (Uncertainty Handling)")
print("="*50)

escaper = EscapeHatchResponder()

start_time = time.time()
escape_result = escaper(f"{task}\n\n{context}\n\nCode:\n{test_code}")
escape_time = time.time() - start_time

print(f"⏱️ Response time: {escape_time:.2f}s")
print(f"📝 Response ({len(escape_result['response'])} chars):")
print(escape_result['response'][:500] + "..." if len(escape_result['response']) > 500 else escape_result['response'])
print(f"\n🎯 Confidence: {escape_result['uncertainty_analysis'].confidence_level:.2f}")
print(f"❓ Uncertainty Level: {escape_result['uncertainty_analysis'].uncertainty_level}")
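
The confidence score printed above can drive a simple gate. The sketch below is an assumption about how one might consume the result, not part of `EscapeHatchResponder`: `UncertaintyStub` is a hypothetical stand-in exposing the same two attributes used in this cell, `route_response` is an illustrative helper, and the 0.7 threshold is arbitrary.

```python
from dataclasses import dataclass

@dataclass
class UncertaintyStub:
    """Hypothetical stand-in with the two attributes read in the cell above."""
    confidence_level: float
    uncertainty_level: str

def route_response(result, min_confidence=0.7):
    """Use the response when confidence clears the threshold; otherwise escalate."""
    analysis = result["uncertainty_analysis"]
    if analysis.confidence_level >= min_confidence:
        return {"action": "use", "text": result["response"]}
    return {
        "action": "escalate",
        "text": f"Needs human review (confidence {analysis.confidence_level:.2f})",
    }

# Hand-built results for illustration; real values come from the responder.
confident = {
    "response": "The query is vulnerable to SQL injection.",
    "uncertainty_analysis": UncertaintyStub(0.92, "low"),
}
unsure = {
    "response": "There might be an issue with session handling.",
    "uncertainty_analysis": UncertaintyStub(0.41, "high"),
}

print(route_response(confident))
print(route_response(unsure))
```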

## 6. Thinking Traces

In [None]:
print("🧠 Thinking Traces")
print("="*50)

tracer = ThinkingTracer(verbose=False)  # Set to False to avoid visualization in notebook

start_time = time.time()
thinking_result = tracer(f"{task}\n\n{context}\n\nCode:\n{test_code}")
thinking_time = time.time() - start_time

print(f"⏱️ Response time: {thinking_time:.2f}s")
print(f"📝 Final Answer ({len(thinking_result['answer'])} chars):")
print(thinking_result['answer'][:300] + "..." if len(thinking_result['answer']) > 300 else thinking_result['answer'])

print("\n🤔 Thinking Steps Preview:")
# Coerce to str so the preview also works if the steps come back as a non-string (e.g. a list)
thinking_steps = str(thinking_result.get('thinking_steps', ''))
print(thinking_steps[:200] + "..." if len(thinking_steps) > 200 else thinking_steps)

## Comparison Summary

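The table below collects the results. Response time and length are measured from the runs above; the Key Strength and Best For columns are hand-written descriptions, not computed values.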

In [None]:
# Create comparison table
results = [
    {
        'Technique': 'Manager-Style',
        'Response Time': f"{manager_time:.2f}s",
        'Response Length': len(manager_result),
        'Key Strength': 'Professional structure, comprehensive guidance',
        'Best For': 'Consistent, detailed responses in professional contexts'
    },
    {
        'Technique': 'Role Prompting', 
        'Response Time': f"{role_time:.2f}s",
        'Response Length': len(role_result),
        'Key Strength': 'Domain expertise, authentic voice',
        'Best For': 'Leveraging specific professional perspectives'
    },
    {
        'Technique': 'Structured Output',
        'Response Time': f"{structured_time:.2f}s", 
        'Response Length': len(structured_result),
        'Key Strength': 'Consistent format, machine-readable',
        'Best For': 'API responses, structured data extraction'
    },
    {
        'Technique': 'Few-Shot Learning',
        'Response Time': f"{fewshot_time:.2f}s",
        'Response Length': len(fewshot_result),
        'Key Strength': 'Learning from examples, pattern recognition',
        'Best For': 'Complex tasks with good examples available'
    },
    {
        'Technique': 'Escape Hatches',
        'Response Time': f"{escape_time:.2f}s",
        'Response Length': len(escape_result['response']),
        'Key Strength': f"Confidence tracking ({escape_result['uncertainty_analysis'].confidence_level:.2f})",
        'Best For': 'Critical decisions, avoiding hallucinations'
    },
    {
        'Technique': 'Thinking Traces',
        'Response Time': f"{thinking_time:.2f}s",
        'Response Length': len(thinking_result['answer']),
        'Key Strength': 'Step-by-step reasoning, debugging thought process',
        'Best For': 'Complex problem-solving, educational content'
    }
]

# Display as HTML table
html_table = """
<table border="1" style="border-collapse: collapse; width: 100%;">
<tr style="background-color: #f2f2f2;">
    <th>Technique</th>
    <th>Response Time</th>
    <th>Length (chars)</th>
    <th>Key Strength</th>
    <th>Best For</th>
</tr>
"""

for result in results:
    html_table += f"""
<tr>
    <td><strong>{result['Technique']}</strong></td>
    <td>{result['Response Time']}</td>
    <td>{result['Response Length']}</td>
    <td>{result['Key Strength']}</td>
    <td>{result['Best For']}</td>
</tr>
"""

html_table += "</table>"

display(HTML(html_table))

## Performance Analysis

In [None]:
# Analyze security vulnerability detection
security_keywords = ['sql injection', 'injection', 'vulnerability', 'security', 'sanitize', 'escape', 'parameterized']

def count_security_mentions(text):
    text_lower = text.lower()
    return sum(1 for keyword in security_keywords if keyword in text_lower)

print("🔒 Security Vulnerability Detection Analysis")
print("="*50)

techniques = [
    ('Manager-Style', manager_result),
    ('Role Prompting', role_result), 
    ('Structured Output', structured_result),
    ('Few-Shot Learning', fewshot_result),
    ('Escape Hatches', escape_result['response']),
    ('Thinking Traces', thinking_result['answer'])
]

for name, result in techniques:
    security_score = count_security_mentions(result)
    # Loose proxy: any mention of "injection" (which also covers "sql injection") counts as a hit
    mentions_sql = 'injection' in result.lower()
    print(f"{name:<18} | Security mentions: {security_score} | Identifies SQL injection: {'✅' if mentions_sql else '❌'}")
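
Substring counting is only a rough proxy: `'escape'` also matches inside "landscape", and a response mentioning "SQL injection" is counted for both `'sql injection'` and `'injection'`. A slightly stricter variant might look like the sketch below. The helper names and the regular expressions are illustrative assumptions, not part of the notebook's source modules.

```python
import re

# Matches "SQL injection", "SQL-injection", or the abbreviation "SQLi".
SQLI_PATTERN = re.compile(r"\bsql[\s-]*injection\b|\bsqli\b", re.IGNORECASE)

def identifies_sql_injection(text):
    """True only when SQL injection specifically is named."""
    return bool(SQLI_PATTERN.search(text))

def count_distinct_keywords(text, keywords):
    """Count keywords that start at a word boundary, without double counting.

    A keyword is skipped when a longer matched keyword already contains it,
    so "sql injection" and "injection" together count once.
    """
    matched = [
        kw for kw in keywords
        if re.search(rf"\b{re.escape(kw)}", text, re.IGNORECASE)
    ]
    return sum(
        1 for kw in matched
        if not any(kw != other and kw in other for other in matched)
    )

keywords = ['sql injection', 'injection', 'vulnerability', 'security',
            'sanitize', 'escape', 'parameterized']

sample = "This is vulnerable to SQL injection; use parameterized queries."
print(count_distinct_keywords(sample, keywords))
print(identifies_sql_injection("Command injection risk"))
```

The boundary is applied only at the start of each keyword, so inflected forms such as "sanitized" still match while embedded matches such as "landscape" do not.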

## Key Insights

### When to Use Each Technique:

1. **Manager-Style**: Professional environments requiring consistent, comprehensive responses
2. **Role Prompting**: When domain expertise and authentic voice matter
3. **Structured Output**: API integrations and data processing pipelines
4. **Few-Shot Learning**: Complex tasks where good examples exist
5. **Escape Hatches**: Critical decisions where confidence levels matter
6. **Thinking Traces**: Educational content and complex problem-solving

### Performance Observations:

- **Response Times**: Generally consistent across techniques (~2-4 seconds per call), though latency varies with the model and network conditions
- **Security Detection**: All six techniques identified the SQL injection vulnerability
- **Response Depth**: Manager-style and few-shot typically provide more comprehensive analysis
- **Confidence Tracking**: Only escape hatches provide explicit uncertainty measurements (a confidence score and an uncertainty level)
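
Each timing in this notebook comes from a single call, so one slow network round trip can skew the comparison. A minimal sketch of repeated timing is shown below, assuming the technique can be wrapped in a zero-argument callable. `time_calls` and `fake_technique` are hypothetical names, and the stub sleeps instead of calling a model.

```python
import statistics
import time

def time_calls(fn, runs=5):
    """Call fn several times and summarize wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }

def fake_technique():
    """Stand-in for a model call; replace with e.g. lambda: engineer(task=..., context=...)."""
    time.sleep(0.01)

stats = time_calls(fake_technique, runs=3)
print(f"median {stats['median']:.3f}s (min {stats['min']:.3f}s, max {stats['max']:.3f}s)")
```

Note that repeated runs against a real model multiply API cost, and any response caching in the client would make later calls look faster than they are.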

### Combination Strategies:

- Combine **Manager-Style + Escape Hatches** for critical business decisions
- Use **Role Prompting + Thinking Traces** for educational content
- Apply **Few-Shot + Structured Output** for consistent data processing
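
As an illustration of the last combination, the sketch below assembles a prompt that contains both worked examples and an explicit output format. It is plain string building under assumed conventions: `build_prompt`, the example dictionaries, and the tag names are hypothetical and do not reflect how `FewShotLearner` or `StructuredOutputGenerator` are implemented.

```python
def build_prompt(task, examples, output_tags):
    """Combine a task, few-shot examples, and a tag-based output format into one prompt."""
    lines = [task, "", "Examples:"]
    for example in examples:
        lines.append(f"Input: {example['input']}")
        lines.append(f"Output: {example['output']}")
        lines.append("")
    tag_list = ", ".join(f"<{tag}>" for tag in output_tags)
    lines.append(f"Respond using exactly these tags: {tag_list}")
    return "\n".join(lines)

# Hand-written examples for illustration only.
examples = [
    {
        "input": "eval(user_input)",
        "output": "<severity>critical</severity><fix>Avoid eval on user input</fix>",
    },
    {
        "input": "open(path).read()",
        "output": "<severity>low</severity><fix>Use a context manager</fix>",
    },
]

prompt = build_prompt(
    "Review the code for security vulnerabilities.",
    examples,
    ["severity", "fix"],
)
print(prompt)
```

Writing the example outputs in the same tag format as the requested output is the point of the combination: the examples demonstrate the format as well as the analysis.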

## Try Your Own Comparison

Modify the test scenario below to compare techniques with your own use case:

In [None]:
# Customize this for your own testing
custom_task = "Your task here"
custom_context = "Your context here" 
custom_input = "Your input data here"

print("🧪 Custom Comparison Test")
print("🎯 Task:", custom_task)
print("📋 Context:", custom_context)
print("📝 Input:", custom_input)
print("\n💡 Modify the variables above and re-run to test with your own scenario!")

# Uncomment to test:
# manager_custom = code_manager(task=custom_task, context=f"{custom_context}\n\n{custom_input}")
# print("\n👔 Manager Result:")
# print(manager_custom)
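
To compare several techniques on a custom scenario without repeating the timing and preview code from each cell, a small runner can help. This is a sketch under the assumption that every technique can be wrapped in a callable taking one prompt string. `run_comparison` and the stub callables are hypothetical; swap in wrappers around the objects created earlier.

```python
import time

def run_comparison(techniques, prompt, preview_chars=500):
    """Run each callable on the same prompt and collect timing, length, and a preview."""
    rows = []
    for name, fn in techniques.items():
        start = time.perf_counter()
        output = str(fn(prompt))
        elapsed = time.perf_counter() - start
        if len(output) > preview_chars:
            preview = output[:preview_chars] + "..."
        else:
            preview = output
        rows.append({
            "technique": name,
            "seconds": elapsed,
            "chars": len(output),
            "preview": preview,
        })
    return rows

# Stub callables so the sketch runs without an API key.
stub_techniques = {
    "echo": lambda p: f"Reviewed: {p}",
    "shout": lambda p: p.upper(),
}

rows = run_comparison(stub_techniques, "check this code", preview_chars=10)
for row in rows:
    print(f"{row['technique']:<8} | {row['seconds']:.4f}s | {row['chars']} chars | {row['preview']}")
```

For the real techniques, a wrapper such as `lambda p: code_manager(task=custom_task, context=p)` would adapt the keyword-argument call style used above to the single-argument interface assumed here.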