# Advanced Prompt Optimization

This notebook explores advanced features of promptomatix, including:
- Different optimization backends
- Various task types
- Custom synthetic data generation
- Advanced configurations

## Prerequisites

Make sure you have completed the basic usage notebook first.

---

## Setup and Imports

In [None]:
import os
import sys
import json
from dotenv import load_dotenv

# Add the src directory to Python path
sys.path.append('../src')

# Import promptomatix functions
from promptomatix.main import process_input

# Load environment variables
load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')

print("✅ Setup complete")
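
`os.getenv` returns `None` when the variable is missing, so a missing key would otherwise surface only inside the first `process_input` call. A minimal sketch of an early check follows; `require_env` is a hypothetical helper written for this notebook, not part of promptomatix:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it "
            "before running this notebook."
        )
    return value


# Possible use in the setup cell, in place of the bare os.getenv call:
# api_key = require_env("OPENAI_API_KEY")
```

Failing here keeps the error message close to its cause instead of burying it in a backend traceback.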

## 1. Different Optimization Backends

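promptomatix supports multiple optimization backends. This section runs two of them, `simple_meta_prompt` and `dspy`, on the same summarization task and compares the results.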

In [None]:
# Test different backends with the same input
test_input = "Summarize a long text in 2-3 sentences"

backends = ["simple_meta_prompt", "dspy"]
results = {}

for backend in backends:
    print(f"\n🔄 Testing {backend} backend...")
    
    config = {
        "raw_input": test_input,
        "model_name": "gpt-3.5-turbo",
        "model_api_key": api_key,
        "model_provider": "openai",
        "backend": backend,
        "synthetic_data_size": 3,
        "task_type": "summarization"
    }
    
    try:
        result = process_input(**config)
        results[backend] = result
        
        print(f"✅ {backend} completed")
        print("📝 Optimized prompt:")
        print(result['result'])
        print(f"💰 Cost: ${result['metrics']['cost']:.4f}")
        print(f"⏱️  Time: {result['metrics']['time_taken']:.2f}s")
        
    except Exception as e:
        print(f"❌ {backend} failed: {e}")
        results[backend] = None
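
The cells in this notebook repeat the same model settings in every config dictionary. One way to cut that repetition is sketched below. The keys mirror those passed to `process_input` above, while `BASE_CONFIG` and `make_config` are hypothetical names introduced here and are not part of promptomatix:

```python
from typing import Any, Dict, Optional

# Settings shared by the experiments in this notebook.
BASE_CONFIG: Dict[str, Any] = {
    "model_name": "gpt-3.5-turbo",
    "model_provider": "openai",
    "backend": "simple_meta_prompt",
    "synthetic_data_size": 3,
}


def make_config(
    raw_input: str,
    task_type: str,
    api_key: Optional[str] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Build a config dict from the shared defaults plus per-run overrides."""
    config = dict(BASE_CONFIG)  # copy so the defaults are never mutated
    config.update(
        raw_input=raw_input,
        task_type=task_type,
        model_api_key=api_key,
    )
    config.update(overrides)  # e.g. backend="dspy"
    return config


# Example: the two backend configs compared above
configs = [
    make_config("Summarize a long text in 2-3 sentences", "summarization", backend=b)
    for b in ("simple_meta_prompt", "dspy")
]
print([c["backend"] for c in configs])
```

Each resulting dictionary can then be passed on as `process_input(**config)`, exactly as in the cell above.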

## 2. Different Task Types

Let's explore various task types that promptomatix can handle.

In [None]:
# Define different task types and their inputs
task_examples = {
    "classification": {
        "input": "Classify emails as spam or not spam",
        "description": "Binary or multi-class classification tasks"
    },
    "summarization": {
        "input": "Summarize a research paper in 200 words",
        "description": "Text summarization tasks"
    },
    "translation": {
        "input": "Translate English text to Spanish",
        "description": "Language translation tasks"
    },
    "qa": {
        "input": "Answer questions about a given text",
        "description": "Question-answering tasks"
    },
    "generation": {
        "input": "Generate creative writing based on a prompt",
        "description": "Text generation tasks"
    }
}

task_results = {}

for task_type, task_info in task_examples.items():
    print(f"\n🔄 Testing {task_type} task...")
    print(f"Description: {task_info['description']}")
    
    config = {
        "raw_input": task_info['input'],
        "model_name": "gpt-3.5-turbo",
        "model_api_key": api_key,
        "model_provider": "openai",
        "backend": "simple_meta_prompt",
        "synthetic_data_size": 2,
        "task_type": task_type
    }
    
    try:
        result = process_input(**config)
        task_results[task_type] = result
        
        print(f"✅ {task_type} completed")
        print("📝 Optimized prompt:")
        print(result['result'])
        print(f"📊 Synthetic data count: {len(result['synthetic_data'])}")
        
    except Exception as e:
        print(f"❌ {task_type} failed: {e}")
        task_results[task_type] = None

## 3. Custom Synthetic Data Generation

Let's see how the number of synthetic examples, set with `synthetic_data_size`, affects cost, run time, and the generated data.

In [None]:
# Test different synthetic data sizes
sizes = [1, 5, 10]
size_results = {}

for size in sizes:
    print(f"\n🔄 Testing with {size} synthetic examples...")
    
    config = {
        "raw_input": "Extract key information from customer reviews",
        "model_name": "gpt-3.5-turbo",
        "model_api_key": api_key,
        "model_provider": "openai",
        "backend": "simple_meta_prompt",
        "synthetic_data_size": size,
        "task_type": "extraction"
    }
    
    try:
        result = process_input(**config)
        size_results[size] = result
        
        print(f"✅ Generated {len(result['synthetic_data'])} examples")
        print(f"💰 Cost: ${result['metrics']['cost']:.4f}")
        print(f"⏱️  Time: {result['metrics']['time_taken']:.2f}s")
        
        # Show a sample of synthetic data
        if result['synthetic_data']:
            print("📊 Sample synthetic data:")
            for i, example in enumerate(result['synthetic_data'][:2], 1):
                print(f"  {i}. {example}")
        
    except Exception as e:
        print(f"❌ Failed: {e}")
        size_results[size] = None
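
To see how cost scales with `synthetic_data_size`, a per-example cost can be derived from `size_results`. This is a minimal sketch that assumes each successful result carries the `metrics['cost']` and `synthetic_data` keys used above. `cost_per_example` is a hypothetical helper, and the sample numbers are invented for illustration, not measured output:

```python
def cost_per_example(size_results):
    """Map each requested size to cost divided by the number of examples returned."""
    per_example = {}
    for size, result in size_results.items():
        # Skip failed runs (None) and runs that returned no examples
        if not result or not result["synthetic_data"]:
            continue
        per_example[size] = result["metrics"]["cost"] / len(result["synthetic_data"])
    return per_example


# Invented sample data in the same shape as size_results
sample_results = {
    1: {"metrics": {"cost": 0.002}, "synthetic_data": ["a"]},
    5: {"metrics": {"cost": 0.005}, "synthetic_data": ["a"] * 5},
    10: None,  # a failed run
}

for size, unit_cost in cost_per_example(sample_results).items():
    print(f"{size:>3} examples: ${unit_cost:.4f} per example")
```

Dividing by the number of examples actually returned, rather than the number requested, keeps the figure honest if the backend produces fewer examples than asked for.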

## 4. Advanced Configurations

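Beyond the basic settings, the next cell passes model parameters (`temperature`, `max_tokens`) and a `train_ratio` that sets the training/validation split.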

In [None]:
# Test with custom model parameters
print("🔄 Testing with custom model parameters...")

advanced_config = {
    "raw_input": "Generate a professional email response",
    "model_name": "gpt-3.5-turbo",
    "model_api_key": api_key,
    "model_provider": "openai",
    "backend": "simple_meta_prompt",
    "synthetic_data_size": 3,
    "task_type": "generation",
    "temperature": 0.3,  # Lower temperature for more focused output
    "max_tokens": 2000,  # Custom max tokens
    "train_ratio": 0.8   # 80% for training, 20% for validation
}

try:
    advanced_result = process_input(**advanced_config)
    
    print("✅ Advanced configuration completed")
    print("📝 Optimized prompt:")
    print(advanced_result['result'])
    print(f"💰 Cost: ${advanced_result['metrics']['cost']:.4f}")
    print(f"⏱️  Time: {advanced_result['metrics']['time_taken']:.2f}s")
    
except Exception as e:
    print(f"❌ Advanced configuration failed: {e}")
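
How promptomatix applies `train_ratio` internally is not shown in this notebook. The sketch below only illustrates what an 80/20 split means for a small synthetic set, using a simple floor-based cut. `split_examples` is a hypothetical helper, not the library's implementation:

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_examples(
    examples: Sequence[T], train_ratio: float = 0.8
) -> Tuple[List[T], List[T]]:
    """Split examples into (train, validation) using a floor-based cut."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1 (exclusive)")
    cut = int(len(examples) * train_ratio)  # floor of the training share
    return list(examples[:cut]), list(examples[cut:])


for n in (3, 10):
    train, val = split_examples(list(range(n)), train_ratio=0.8)
    print(f"{n} examples -> {len(train)} train / {len(val)} validation")
```

Under this scheme, three synthetic examples leave a single validation example, which is worth keeping in mind when combining a small `synthetic_data_size` with a high `train_ratio`.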

## 5. Comparison and Analysis

Let's compare the results from different configurations.

In [None]:
# Compare backend performance
print("📊 Backend Performance Comparison:")
print("-" * 60)
print(f"{'Backend':<20} {'Cost':<10} {'Time':<10} {'Status':<10}")
print("-" * 60)

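for backend, result in results.items():
    if result:
        # Format metrics only for successful runs
        cost = f"${result['metrics']['cost']:.4f}"
        elapsed = f"{result['metrics']['time_taken']:.2f}s"
        status = "✅ Success"
    else:
        # Show n/a instead of a misleading zero for failed runs
        cost = elapsed = "n/a"
        status = "❌ Failed"
    
    print(f"{backend:<20} {cost:<10} {elapsed:<10} {status:<10}")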

print("\n📊 Task Type Performance:")
print("-" * 60)
print(f"{'Task Type':<15} {'Cost':<10} {'Time':<10} {'Status':<10}")
print("-" * 60)

for task_type, result in task_results.items():
    if result:
        # Format metrics only for successful runs
        cost = f"${result['metrics']['cost']:.4f}"
        elapsed = f"{result['metrics']['time_taken']:.2f}s"
        status = "✅ Success"
    else:
        # Show n/a instead of a misleading zero for failed runs
        cost = elapsed = "n/a"
        status = "❌ Failed"
    
    print(f"{task_type:<15} {cost:<10} {elapsed:<10} {status:<10}")
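
Beyond printing a table, the cheapest successful run can be picked programmatically. This is a sketch that assumes the same result structure as above, with `None` marking a failed run. `cheapest_successful` is a hypothetical helper and the sample costs are invented:

```python
def cheapest_successful(results):
    """Return the key of the lowest-cost successful run, or None if all failed."""
    successful = {name: r for name, r in results.items() if r}
    if not successful:
        return None
    return min(successful, key=lambda name: successful[name]["metrics"]["cost"])


# Invented sample data in the same shape as the results dict
sample_results = {
    "simple_meta_prompt": {"metrics": {"cost": 0.004, "time_taken": 6.1}},
    "dspy": {"metrics": {"cost": 0.019, "time_taken": 41.7}},
    "other_backend": None,  # a failed run
}

print(f"Cheapest successful run: {cheapest_successful(sample_results)}")
```

The same function works on `task_results`, since both dictionaries use the same layout.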

## 6. Best Practices and Tips

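The cell below prints general guidelines for the settings covered in this notebook. Treat them as starting points and check them against the cost and time figures from your own runs: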

In [None]:
print("💡 Best Practices for Advanced Prompt Optimization:")
print("\n1. Backend Selection:")
print("   - Use 'simple_meta_prompt' for most tasks (faster, cheaper)")
print("   - Use 'dspy' for complex reasoning tasks")

print("\n2. Task Type Selection:")
print("   - Choose the most specific task type for better results")
print("   - Use 'qa' for question-answering")
print("   - Use 'classification' for categorization tasks")
print("   - Use 'summarization' for text compression")

print("\n3. Synthetic Data Size:")
print("   - Start with 3-5 examples for testing")
print("   - Use 10+ examples for production")
print("   - Balance between cost and quality")

print("\n4. Model Parameters:")
print("   - Lower temperature (0.1-0.3) for consistent results")
print("   - Higher temperature (0.7-0.9) for creative tasks")
print("   - Adjust max_tokens based on expected output length")

print("\n5. Cost Optimization:")
print("   - Use smaller models for testing (gpt-3.5-turbo)")
print("   - Use larger models for final optimization (gpt-4)")
print("   - Monitor costs with the metrics provided")
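
The guidelines above can also be captured as reusable presets. The values below are taken from the printed tips (3 examples and `gpt-3.5-turbo` for testing, 10 examples and `gpt-4` for a final run, lower temperature for consistency, higher for creative tasks). `PRESETS` and `apply_preset` are hypothetical names for this sketch and are not part of promptomatix:

```python
from typing import Any, Dict

# Preset values follow the printed guidelines; adjust them to your own results.
PRESETS: Dict[str, Dict[str, Any]] = {
    "testing": {"model_name": "gpt-3.5-turbo", "synthetic_data_size": 3},
    "production": {"model_name": "gpt-4", "synthetic_data_size": 10},
}


def apply_preset(config: Dict[str, Any], preset: str, creative: bool = False) -> Dict[str, Any]:
    """Return a copy of config with the preset and a temperature choice applied."""
    merged = {**config, **PRESETS[preset]}  # preset values win over config values
    # 0.3 sits in the "consistent" range, 0.8 in the "creative" range
    merged["temperature"] = 0.8 if creative else 0.3
    return merged


base = {"raw_input": "Generate a professional email response", "task_type": "generation"}
print(apply_preset(base, "testing"))
print(apply_preset(base, "production", creative=True))
```

Keeping the presets in one place makes it easier to move from a cheap test run to a final optimization without editing each cell by hand.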

## Summary

In this notebook, we explored:

- ✅ **Different Backends**: Compared `simple_meta_prompt` vs `dspy`
- ✅ **Task Types**: Tested classification, summarization, translation, QA, and generation
- ✅ **Synthetic Data**: Explored different data generation sizes
- ✅ **Advanced Config**: Custom model parameters and training ratios
- ✅ **Performance Analysis**: Cost and time comparisons
- ✅ **Best Practices**: Guidelines for optimal usage

### Key Insights:

- **Backend Choice**: `simple_meta_prompt` is generally faster and cheaper than `dspy`; confirm this against the comparison table from your own run
- **Task Specificity**: More specific task types yield better results
- **Data Size**: 3-5 examples are good for testing, 10+ for production
- **Cost Management**: Monitor costs and adjust parameters accordingly

### Next Steps:

- Explore metrics and evaluation in `03_metrics_evaluation.ipynb`
- Learn advanced features in `04_advanced_features.ipynb`
- Try batch processing with the scripts

---

**Ready for more?** Check out the metrics evaluation notebook to understand how to measure prompt quality!