# Parameter Tuning with Amorsize

This notebook demonstrates how to use Amorsize's parameter tuning features to find optimal parallelization parameters through empirical benchmarking.

## Prerequisites

Make sure you've completed **Getting Started** (01_getting_started.ipynb) first!

## What You'll Learn

1. **Grid Search Tuning** - Systematically test parameter combinations
2. **Quick Tuning** - Fast tuning with minimal configurations
3. **Bayesian Optimization** - Intelligent parameter search
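4. **Comparison Analysis** - Validate tuning vs optimizer recommendations
5. **Configuration Management** - Save and reuse optimal parameters
6. **Advanced Patterns** - Tune across workload sizes and I/O-bound tasks
7. **Real-World Workflow** - Complete tuning pipeline for production
8. **Performance Visualization** - Compare speedup across configurations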

## When to Use Parameter Tuning

- **Production workloads**: When you need performance that has been measured, not just predicted
- **Repeated execution**: When the function runs many times with similar data
- **Validation**: To verify optimizer recommendations empirically
- **Edge cases**: When the optimizer can't safely sample your function (e.g., it has side effects or writes to a database)

**Note**: Tuning requires actual execution, so it takes longer than `optimize()`. Use when the extra time investment pays off!

## Setup

In [None]:
import time
from amorsize import (
    optimize,
    tune_parameters,
    quick_tune,
    bayesian_tune_parameters,
    save_config,
    load_config
)

# For visualization (optional)
try:
    import matplotlib.pyplot as plt
    HAS_MATPLOTLIB = True
except ImportError:
    HAS_MATPLOTLIB = False
    print("Matplotlib not available - skipping visualizations")

print("✅ Setup complete!")

## Part 1: Understanding Grid Search Tuning

Grid search systematically tests every combination of parameters to find the empirically best configuration.
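
As a rough illustration of what a grid search does, the sketch below times every `(n_jobs, chunksize)` pair using only the standard library and keeps the fastest. It is not Amorsize's implementation: `grid_search` and `square` are hypothetical names, and a `ThreadPool` stands in for a process pool so the example runs anywhere without pickling concerns.

```python
import itertools
import time
from multiprocessing.pool import ThreadPool


def grid_search(func, data, n_jobs_range, chunksize_range):
    """Time every (n_jobs, chunksize) pair and return the fastest one."""
    timings = {}
    for n_jobs, chunksize in itertools.product(n_jobs_range, chunksize_range):
        start = time.perf_counter()
        # ThreadPool is an assumption made for portability; a real tuner
        # for CPU-bound work would normally use processes instead.
        with ThreadPool(n_jobs) as pool:
            pool.map(func, data, chunksize=chunksize)
        timings[(n_jobs, chunksize)] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings


def square(x):
    return x * x


best, timings = grid_search(square, list(range(200)), [1, 2], [10, 50])
print(f"Tested {len(timings)} configurations; "
      f"fastest: n_jobs={best[0]}, chunksize={best[1]}")
```

The number of benchmark runs is the product of the two range lengths, which is why exhaustive search gets expensive as the ranges grow.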

### Example: CPU-Intensive Task

In [None]:
# Define a CPU-intensive function
def cpu_intensive_task(x):
    """Simulate expensive computation."""
    result = 0
    for i in range(1000):
        result += x ** 2
    return result

# Create test data
data = list(range(100, 300))

# Run grid search tuning
print("Running grid search tuning...")
result = tune_parameters(
    cpu_intensive_task,
    data,
    n_jobs_range=[1, 2, 4],          # Test 1, 2, and 4 workers
    chunksize_range=[20, 50, 100],   # Test 3 chunk sizes
    verbose=True
)

print("\n" + "="*70)
print(result)

### Analyzing Tuning Results
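
The cell in this section reads speedup and execution-time fields from the tuning result. As a minimal sketch of how such figures relate, assuming speedup means serial time divided by parallel time (consistent with the 1.0x serial baseline used later in this notebook), the helpers below rank a dictionary of timings. The names `speedup` and `rank_configurations` and the timing values are hypothetical, not part of Amorsize.

```python
def speedup(serial_time, parallel_time):
    """Return how many times faster the parallel run was than the serial run."""
    if parallel_time <= 0:
        raise ValueError("parallel_time must be positive")
    return serial_time / parallel_time


def rank_configurations(timings, serial_time, top_n=5):
    """Sort {(n_jobs, chunksize): seconds} from fastest to slowest."""
    ranked = sorted(timings.items(), key=lambda item: item[1])
    return [
        (n_jobs, chunksize, seconds, speedup(serial_time, seconds))
        for (n_jobs, chunksize), seconds in ranked[:top_n]
    ]


# Made-up timings, for illustration only
example_timings = {(1, 20): 0.40, (2, 20): 0.25, (4, 50): 0.10}
top = rank_configurations(example_timings, serial_time=0.40, top_n=2)
for n_jobs, chunksize, seconds, ratio in top:
    print(f"n_jobs={n_jobs}, chunksize={chunksize}: {seconds:.2f}s ({ratio:.2f}x)")
```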

In [None]:
# Extract key metrics
print("Best configuration found:")
print(f"  n_jobs: {result.best_n_jobs}")
print(f"  chunksize: {result.best_chunksize}")
print(f"  Speedup: {result.best_speedup:.2f}x")
print(f"  Execution time: {result.best_time:.4f}s")
print(f"\nConfigurations tested: {result.configurations_tested}")

# Show top 5 configurations
print("\nTop 5 configurations:")
for i, (n_jobs, chunksize, exec_time, speedup) in enumerate(result.get_top_configurations(5), 1):
    print(f"  {i}. n_jobs={n_jobs:2d}, chunksize={chunksize:4d} -> "
          f"{exec_time:.4f}s ({speedup:.2f}x)")

### Visualizing Results

In [None]:
if HAS_MATPLOTLIB:
    # Create heatmap of execution times
    import numpy as np
    
    # Extract unique n_jobs and chunksizes
    configs = list(result.all_results.keys())
    n_jobs_vals = sorted(set(c[0] for c in configs))
    chunksize_vals = sorted(set(c[1] for c in configs))
    
    # Create matrix of execution times
    times_matrix = np.zeros((len(chunksize_vals), len(n_jobs_vals)))
    for i, cs in enumerate(chunksize_vals):
        for j, nj in enumerate(n_jobs_vals):
            times_matrix[i, j] = result.all_results.get((nj, cs), np.nan)
    
    # Create heatmap
    fig, ax = plt.subplots(figsize=(10, 6))
    im = ax.imshow(times_matrix, aspect='auto', cmap='RdYlGn_r')
    
    # Set ticks and labels
    ax.set_xticks(range(len(n_jobs_vals)))
    ax.set_yticks(range(len(chunksize_vals)))
    ax.set_xticklabels(n_jobs_vals)
    ax.set_yticklabels(chunksize_vals)
    ax.set_xlabel('n_jobs')
    ax.set_ylabel('chunksize')
    ax.set_title('Execution Time by Configuration (seconds)\nGreen = Faster')
    
    # Add colorbar
    cbar = plt.colorbar(im, ax=ax)
    cbar.set_label('Execution Time (s)', rotation=270, labelpad=20)
    
    # Annotate cells with values
    for i in range(len(chunksize_vals)):
        for j in range(len(n_jobs_vals)):
            if not np.isnan(times_matrix[i, j]):
                ax.text(j, i, f'{times_matrix[i, j]:.3f}',
                        ha="center", va="center", color="black", fontsize=9)
    
    plt.tight_layout()
    plt.show()
    
    print("✅ Heatmap shows execution time for each configuration")
    print("   Dark green = Fastest, Dark red = Slowest")
else:
    print("Install matplotlib to see visualizations: pip install matplotlib")

## Part 2: Quick Tuning for Rapid Prototyping

When you don't have time for exhaustive search, use `quick_tune()` to test a minimal set of likely-optimal configurations.
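
Which configurations `quick_tune()` actually tries is not shown in this notebook, so the following is only a sketch of how a minimal candidate set could be built. It assumes a four-chunks-per-worker rule of thumb, similar to the default chunk size in CPython's `Pool.map`, and three worker counts: serial, half the cores, and all cores. `default_chunksize` and `quick_candidates` are hypothetical names.

```python
import math
import os


def default_chunksize(n_items, n_jobs):
    """Chunk size that splits the data into about four chunks per worker."""
    return max(1, math.ceil(n_items / (n_jobs * 4)))


def quick_candidates(n_items, max_workers=None):
    """Return a short list of (n_jobs, chunksize) pairs worth benchmarking."""
    max_workers = max_workers or os.cpu_count() or 1
    # Serial baseline, half the cores, all cores (deduplicated and sorted)
    worker_counts = sorted({1, max(1, max_workers // 2), max_workers})
    return [(n, default_chunksize(n_items, n)) for n in worker_counts]


print(quick_candidates(n_items=200, max_workers=4))
# prints [(1, 50), (2, 25), (4, 13)]
```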

In [None]:
print("Running quick tuning (minimal search)...")
quick_result = quick_tune(
    cpu_intensive_task,
    data,
    verbose=True
)

print("\n" + "="*70)
print(quick_result)

# Compare with full grid search
print("\nComparison:")
print(f"  Quick tune: {quick_result.configurations_tested} configs tested")
print(f"  Grid search: {result.configurations_tested} configs tested")
print(f"  Time saved: ~{(result.configurations_tested - quick_result.configurations_tested) * result.best_time:.2f}s")
print(f"  Performance difference: {abs(quick_result.best_speedup - result.best_speedup):.2f}x")

## Part 3: Bayesian Optimization for Intelligent Search

Bayesian optimization uses machine learning to intelligently explore the parameter space, finding good configurations with fewer trials than grid search.

**Note**: Requires `scikit-optimize`. Install with: `pip install scikit-optimize`
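
To make the "fewer trials than grid search" idea concrete without extra dependencies, here is a sketch of a simpler stand-in: a budget-limited random search over the same grid. This is explicitly not Bayesian optimization. A Bayesian optimizer would also use the timings observed so far to decide which configuration to try next, whereas this sketch samples blindly. `budgeted_search` and `double` are hypothetical names, and `ThreadPool` is used only so the example runs anywhere.

```python
import itertools
import random
import time
from multiprocessing.pool import ThreadPool


def budgeted_search(func, data, n_jobs_range, chunksize_range,
                    n_iterations, seed=None):
    """Time a random sample of the grid and return the fastest pair found."""
    grid = list(itertools.product(n_jobs_range, chunksize_range))
    rng = random.Random(seed)
    # Sample without replacement so no configuration is timed twice
    trials = rng.sample(grid, k=min(n_iterations, len(grid)))
    timings = {}
    for n_jobs, chunksize in trials:
        start = time.perf_counter()
        with ThreadPool(n_jobs) as pool:
            pool.map(func, data, chunksize=chunksize)
        timings[(n_jobs, chunksize)] = time.perf_counter() - start
    return min(timings, key=timings.get), timings


def double(x):
    return x * 2


best, timings = budgeted_search(
    double, list(range(100)), [1, 2, 4], [5, 10, 25, 50],
    n_iterations=5, seed=42,
)
print(f"Tested {len(timings)} of 12 possible configurations; fastest: {best}")
```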

In [None]:
try:
    print("Running Bayesian optimization...")
    bayesian_result = bayesian_tune_parameters(
        cpu_intensive_task,
        data,
        n_iterations=15,  # Test 15 intelligently-chosen configurations
        verbose=True,
        random_state=42  # For reproducibility
    )
    
    print("\n" + "="*70)
    print(bayesian_result)
    
    # Compare efficiency
    print("\nEfficiency Comparison:")
    print(f"  Bayesian: {bayesian_result.configurations_tested} configs tested")
    print(f"  Grid search: {result.configurations_tested} configs tested")
    print(f"  Bayesian speedup found: {bayesian_result.best_speedup:.2f}x")
    print(f"  Grid search speedup: {result.best_speedup:.2f}x")
    print(f"  Efficiency: {bayesian_result.best_speedup / bayesian_result.configurations_tested:.3f}x per config")
    
except ImportError:
    print("⚠️  scikit-optimize not installed. Skipping Bayesian optimization demo.")
    print("   Install with: pip install scikit-optimize")
    print("   Falls back to grid search automatically.")

## Part 4: Comparing Tuning with Optimizer Recommendations

The optimizer (`optimize()`) predicts parameters without benchmarking your full workload. Let's see how tuning compares!

In [None]:
# Get optimizer recommendation
print("Getting optimizer recommendation...")
opt_result = optimize(cpu_intensive_task, data, verbose=False)

print("\nOptimizer Recommendation:")
print(f"  n_jobs: {opt_result.n_jobs}")
print(f"  chunksize: {opt_result.chunksize}")
print(f"  Predicted speedup: {opt_result.estimated_speedup:.2f}x")

print("\nTuning Results:")
print(f"  n_jobs: {result.best_n_jobs}")
print(f"  chunksize: {result.best_chunksize}")
print(f"  Actual speedup: {result.best_speedup:.2f}x")

# Check if they match
if opt_result.n_jobs == result.best_n_jobs and opt_result.chunksize == result.best_chunksize:
    print("\n✅ Optimizer recommendation matches tuning results!")
    print("   The optimizer is highly accurate for this workload.")
else:
    print("\n⚠️  Optimizer recommendation differs from tuning.")
    print("   This can happen when:")
    print("   - Workload has unusual characteristics")
    print("   - System has special constraints")
    print("   - Function has side effects")
    speedup_diff = abs(opt_result.estimated_speedup - result.best_speedup)
    if speedup_diff < 0.5:
        print(f"   Prediction error is small ({speedup_diff:.2f}x), both are good choices.")
    else:
        print(f"   Significant difference ({speedup_diff:.2f}x), use tuning results.")

## Part 5: Configuration Management

Save optimal parameters for reuse in production, avoiding repeated tuning.
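
If you want to see the save-and-reload idea in isolation, the sketch below round-trips tuned parameters through a JSON file using only the standard library. The file layout and the helper names `save_tuned_config` and `load_tuned_config` are hypothetical; they do not describe the format that Amorsize itself writes.

```python
import json
import os
import tempfile


def save_tuned_config(path, n_jobs, chunksize, notes=""):
    """Write tuned parameters to a JSON file (hypothetical schema)."""
    payload = {"n_jobs": n_jobs, "chunksize": chunksize, "notes": notes}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)


def load_tuned_config(path):
    """Read tuned parameters back and check the required keys are present."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    missing = {"n_jobs", "chunksize"} - payload.keys()
    if missing:
        raise KeyError(f"config is missing keys: {sorted(missing)}")
    return payload


# Round-trip through a temporary directory that is removed afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tuned_config.json")
    save_tuned_config(path, n_jobs=4, chunksize=50, notes="example values")
    restored = load_tuned_config(path)

print(restored)
```

Validating the keys on load means a truncated or hand-edited file fails at startup instead of partway through a production run.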

In [None]:
import tempfile
import os

# Create a temporary config file
config_path = os.path.join(tempfile.gettempdir(), 'amorsize_tuned_config.json')

# Save the best configuration
print("Saving optimal configuration...")
result.save_config(
    config_path,
    function_name='cpu_intensive_task',
    notes='Optimal config found via grid search tuning',
    overwrite=True
)
print(f"✅ Configuration saved to: {config_path}")

# Load and use the configuration
print("\nLoading configuration...")
config = load_config(config_path)
print(f"Loaded config: n_jobs={config.n_jobs}, chunksize={config.chunksize}")

# Use in production
from multiprocessing import Pool

def process_with_saved_config(func, data, config_path):
    """Production pattern: Load config and execute."""
    config = load_config(config_path)
    
    with Pool(config.n_jobs) as pool:
        results = pool.map(func, data, chunksize=config.chunksize)
    
    return results

print("\n✅ Configuration ready for production use!")
print("   Load once at startup, reuse for all executions.")

# Clean up
if os.path.exists(config_path):
    os.remove(config_path)

## Part 6: Advanced Tuning Patterns

### Pattern 1: Tuning for Different Workload Sizes

In [None]:
# Test how optimal parameters change with data size
print("Testing parameter scaling across workload sizes...\n")

workload_sizes = [50, 100, 200, 500]
tuning_results = {}

for size in workload_sizes:
    test_data = list(range(100, 100 + size))
    tune_result = quick_tune(cpu_intensive_task, test_data, verbose=False)
    tuning_results[size] = tune_result
    
    print(f"Size {size:4d}: n_jobs={tune_result.best_n_jobs}, "
          f"chunksize={tune_result.best_chunksize:4d}, "
          f"speedup={tune_result.best_speedup:.2f}x")

print("\n💡 Insight: Larger workloads often benefit from more workers and larger chunks.")

### Pattern 2: Tuning I/O-Bound Tasks with Threads

In [None]:
# Define I/O-bound function
def io_bound_task(x):
    """Simulate I/O wait (network, disk, etc.)."""
    time.sleep(0.01)  # 10ms wait
    return x * 2

io_data = list(range(50))

print("Tuning I/O-bound task with threads...")
io_result = quick_tune(
    io_bound_task,
    io_data,
    prefer_threads_for_io=True,  # Use threads instead of processes
    verbose=True
)

print("\n" + "="*70)
print(io_result)
print("\n💡 Insight: I/O-bound tasks typically benefit from more workers (threads).")
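
Threads help here because a task that is waiting on I/O (or sleeping) releases the GIL, so other threads can make progress in the meantime. The following standalone sketch compares a serial loop with a `ThreadPoolExecutor` on a simulated 10 ms wait. `fake_io`, `timed`, and `run_threaded` are hypothetical helpers, and the worker count of 10 is an arbitrary choice for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(x):
    """Stand-in for a network or disk call: wait 10 ms, then return."""
    time.sleep(0.01)
    return x * 2


def timed(run):
    """Call run() and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = run()
    return result, time.perf_counter() - start


items = list(range(20))

# Serial baseline: each wait happens one after another
serial_result, serial_time = timed(lambda: [fake_io(x) for x in items])


def run_threaded():
    # Ten threads let up to ten waits overlap
    with ThreadPoolExecutor(max_workers=10) as pool:
        return list(pool.map(fake_io, items))


threaded_result, threaded_time = timed(run_threaded)
print(f"serial: {serial_time:.3f}s, 10 threads: {threaded_time:.3f}s")
```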

## Part 7: Complete Production Workflow

Here's a complete example of a production tuning workflow:

In [None]:
def production_tuning_workflow(func, data, config_dir=tempfile.gettempdir()):
    """
    Complete tuning workflow for production deployment.
    
    Steps:
    1. Quick tune to get initial parameters
    2. Validate with optimizer
    3. Fine-tune if needed
    4. Save configuration
    5. Return ready-to-use config
    """
    print("=" * 70)
    print("PRODUCTION TUNING WORKFLOW")
    print("=" * 70)
    
    # Step 1: Quick tune
    print("\n[1/5] Running quick tune...")
    quick_result = quick_tune(func, data, verbose=False)
    print(f"      Quick tune found: n_jobs={quick_result.best_n_jobs}, "
          f"chunksize={quick_result.best_chunksize} ({quick_result.best_speedup:.2f}x)")
    
    # Step 2: Get optimizer recommendation
    print("\n[2/5] Getting optimizer recommendation...")
    opt_result = optimize(func, data, verbose=False)
    print(f"      Optimizer suggests: n_jobs={opt_result.n_jobs}, "
          f"chunksize={opt_result.chunksize} ({opt_result.estimated_speedup:.2f}x predicted)")
    
    # Step 3: Decide if fine-tuning needed
    print("\n[3/5] Validating results...")
    configs_match = (quick_result.best_n_jobs == opt_result.n_jobs and 
                    quick_result.best_chunksize == opt_result.chunksize)
    
    if configs_match:
        print("      ✅ Quick tune matches optimizer - high confidence")
        final_result = quick_result
    else:
        print("      ⚠️  Differences detected - running fine-tuning...")
        # Fine-tune around both suggestions
        n_jobs_range = sorted(set([quick_result.best_n_jobs, opt_result.n_jobs]))
        chunksize_range = sorted(set([quick_result.best_chunksize, opt_result.chunksize]))
        
        final_result = tune_parameters(
            func, data,
            n_jobs_range=n_jobs_range,
            chunksize_range=chunksize_range,
            verbose=False
        )
        print(f"      Fine-tuning complete: {final_result.best_speedup:.2f}x")
    
    # Step 4: Save configuration
    print("\n[4/5] Saving configuration...")
    config_path = os.path.join(config_dir, 'production_config.json')
    final_result.save_config(
        config_path,
        function_name=func.__name__,
        notes='Production-tuned configuration',
        overwrite=True
    )
    print(f"      Saved to: {config_path}")
    
    # Step 5: Summary
    print("\n[5/5] Summary")
    print(f"      Final configuration: n_jobs={final_result.best_n_jobs}, "
          f"chunksize={final_result.best_chunksize}")
    print(f"      Expected speedup: {final_result.best_speedup:.2f}x")
    print(f"      Configurations tested: {final_result.configurations_tested}")
    print("\n✅ Ready for production!")
    
    return final_result, config_path

# Run the workflow
final_result, config_path = production_tuning_workflow(cpu_intensive_task, data)

# Clean up
if os.path.exists(config_path):
    os.remove(config_path)

## Part 8: Performance Visualization

Compare speedup across different configurations visually.

In [None]:
if HAS_MATPLOTLIB:
    # Get top configurations
    top_configs = result.get_top_configurations(10)
    
    # Extract data for plotting
    config_labels = [f"n={nj}, c={cs}" for nj, cs, _, _ in top_configs]
    speedups = [speedup for _, _, _, speedup in top_configs]
    
    # Create bar chart
    fig, ax = plt.subplots(figsize=(12, 6))
    bars = ax.bar(range(len(config_labels)), speedups, color='steelblue', alpha=0.7)
    
    # Highlight best configuration
    bars[0].set_color('green')
    bars[0].set_alpha(0.9)
    
    ax.set_xlabel('Configuration (n_jobs, chunksize)')
    ax.set_ylabel('Speedup (x)')
    ax.set_title('Top 10 Configurations by Speedup')
    ax.set_xticks(range(len(config_labels)))
    ax.set_xticklabels(config_labels, rotation=45, ha='right')
    ax.axhline(y=1.0, color='red', linestyle='--', alpha=0.5, label='Serial baseline')
    ax.legend()
    ax.grid(axis='y', alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print("✅ Chart shows speedup for top 10 configurations")
    print("   Green bar = Best configuration")
else:
    print("Install matplotlib to see visualizations: pip install matplotlib")

## Key Takeaways

### When to Use Each Approach

1. **`optimize()`** - Default choice
   - Fast (no full benchmark runs)
   - Good for most workloads
   - Use when: Quick decisions needed, workload is typical

2. **`quick_tune()`** - Rapid validation
   - Minimal configurations tested (~3-5)
   - Good balance of speed and accuracy
   - Use when: Need empirical validation but time is limited

3. **`tune_parameters()`** - Thorough search
   - Exhaustive grid search
   - Finds the fastest configuration among those tested (subject to timing noise)
   - Use when: Production workload, need confidence

4. **`bayesian_tune_parameters()`** - Intelligent search
   - ML-guided exploration
   - Efficient for large search spaces
   - Use when: Many parameters to tune, expensive benchmarks

### Best Practices

1. **Start with optimizer**: Use `optimize()` for quick decisions
2. **Validate if critical**: Use `quick_tune()` to verify optimizer recommendations
3. **Save configurations**: Store tuned parameters for reuse
4. **Re-tune periodically**: When data characteristics change significantly
5. **Consider cost**: Tuning takes time - use when repeated execution justifies it
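
The cost trade-off in the last point can be estimated with simple arithmetic: tuning pays off once the time saved per run, multiplied by the number of runs, exceeds the time spent tuning. The sketch below computes that break-even point. `break_even_runs` is a hypothetical helper and the numbers are illustrative, not measurements.

```python
import math


def break_even_runs(tuning_seconds, untuned_seconds, tuned_seconds):
    """Number of production runs after which tuning has paid for itself.

    Returns None when the tuned configuration is not faster.
    """
    saved_per_run = untuned_seconds - tuned_seconds
    if saved_per_run <= 0:
        return None
    return math.ceil(tuning_seconds / saved_per_run)


# Illustrative numbers: 30 s of tuning, 2.0 s per run before, 1.5 s after
print(break_even_runs(30.0, 2.0, 1.5))
# prints 60
```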

### Common Patterns

```python
# Development: Quick optimization
result = optimize(func, data)

# Pre-production: Validate with quick tune
result = quick_tune(func, data)

# Production: Full tuning + save config
result = tune_parameters(func, data, ...)
result.save_config('production.json')

# Production runtime: Load saved config
from multiprocessing import Pool

config = load_config('production.json')
with Pool(config.n_jobs) as pool:
    results = pool.map(func, data, chunksize=config.chunksize)
```

## Next Steps

- **Use Case Notebooks**: Apply tuning to specific domains (web, data, ML)
- **Advanced Features**: Explore hooks, monitoring, checkpointing
- **Production Deployment**: Integrate with your workflow

## Need Help?

- **Documentation**: See `docs/` directory
- **Examples**: Check `examples/` for more patterns
- **Issues**: Report bugs on GitHub

Happy tuning! 🚀