# Lab 1: Introduction to Optimization

## Learning Objectives

By the end of this lab, you will:
- Understand what optimization means in AI
- Formulate optimization problems
- Identify objective functions and constraints
- Distinguish between local and global optima
- Visualize optimization landscapes
- Implement basic optimization strategies

## What is Optimization?

**Optimization** is the process of finding the best solution from all feasible solutions.

**General form**:
```
maximize/minimize f(x)
subject to: g_i(x) ≤ 0  (inequality constraints)
            h_j(x) = 0  (equality constraints)
            x ∈ X       (domain constraints)
```

Where:
- **f(x)** = objective function (what we want to optimize)
- **x** = decision variables (what we control)
- **g_i, h_j** = constraints (what we must satisfy)
- **X** = domain of the decision variables (e.g., real numbers, integers, or {0, 1})
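
To make the general form concrete, here is a minimal sketch on a hypothetical toy problem that is not part of the lab: minimize f(x) = (x - 3)^2 subject to g(x) = x - 2 ≤ 0 with x ∈ [0, 5]. The names `toy_objective`, `toy_constraint`, and `candidates` are illustrative assumptions, and the grid search is only a stand-in for a real solver.

```python
import numpy as np

# Hypothetical toy problem (an assumption for illustration):
#   minimize   f(x) = (x - 3)^2
#   subject to g(x) = x - 2 <= 0,  x in X = [0, 5]
def toy_objective(x):
    return (x - 3.0) ** 2

def toy_constraint(x):
    return x - 2.0

# Sample the domain X and keep only the points that satisfy g(x) <= 0.
candidates = np.linspace(0.0, 5.0, 501)
feasible = candidates[toy_constraint(candidates) <= 0.0]

# Among the feasible points, pick the one with the smallest objective value.
best_x = feasible[np.argmin(toy_objective(feasible))]

print(f"Unconstrained minimizer x = 3 is infeasible: g(3) = {toy_constraint(3.0):.1f}")
print(f"Constrained minimizer: x = {best_x:.2f}, f(x) = {toy_objective(best_x):.2f}")
```

The constraint moves the solution from the unconstrained minimizer x = 3 to the boundary of the feasible region near x = 2, which is typical when a constraint is active.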

## Why Optimization in AI?

Optimization is **everywhere** in AI:
- 🎯 **Training ML models**: Minimize loss function
- 🗺️ **Path planning**: Minimize distance/time
- 📅 **Scheduling**: Minimize conflicts/costs
- 🎮 **Game AI**: Maximize score/win probability
- 🤖 **Robot control**: Minimize energy/time
- 📊 **Resource allocation**: Maximize utility

## Real-World Applications

- 🚗 **Uber/Lyft**: Route optimization, driver assignment
- 📦 **Amazon**: Warehouse layout, delivery routes
- ✈️ **Airlines**: Crew scheduling, flight routing
- 🏭 **Manufacturing**: Production planning, supply chains
- 💰 **Finance**: Portfolio optimization, trading strategies


In [None]:
# Import libraries
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from typing import Callable, List, Tuple, Optional
import time

# Set random seed
np.random.seed(42)

# Plot settings
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (12, 6)

## Part 1: Formulating Optimization Problems

### Example 1: The Knapsack Problem

**Problem**: You have a knapsack with limited capacity. Choose items to maximize value.

**Formulation**:
- **Decision variables**: x_i ∈ {0, 1} (take item i or not)
- **Objective**: maximize Σ v_i × x_i (total value)
- **Constraint**: Σ w_i × x_i ≤ W (weight limit)


In [None]:
class KnapsackProblem:
    """The 0-1 Knapsack Problem."""
    
    def __init__(self, values: List[float], weights: List[float], capacity: float):
        """
        Initialize knapsack problem.
        
        Args:
            values: Value of each item
            weights: Weight of each item
            capacity: Maximum weight capacity
        """
        self.values = np.array(values)
        self.weights = np.array(weights)
        self.capacity = capacity
        self.n_items = len(values)
    
    def is_feasible(self, solution: np.ndarray) -> bool:
        """Check if solution satisfies constraints."""
        total_weight = np.sum(solution * self.weights)
        return total_weight <= self.capacity
    
    def objective(self, solution: np.ndarray) -> float:
        """Calculate objective function value."""
        if not self.is_feasible(solution):
            return -np.inf  # Infeasible solution
        return np.sum(solution * self.values)
    
    def greedy_solution(self) -> np.ndarray:
        """Simple greedy heuristic: best value-to-weight ratio."""
        # Calculate value per unit weight
        ratios = self.values / self.weights
        
        # Sort items by ratio (descending)
        sorted_indices = np.argsort(-ratios)
        
        solution = np.zeros(self.n_items, dtype=int)
        remaining_capacity = self.capacity
        
        for idx in sorted_indices:
            if self.weights[idx] <= remaining_capacity:
                solution[idx] = 1
                remaining_capacity -= self.weights[idx]
        
        return solution
    
    def __repr__(self):
        return f"KnapsackProblem({self.n_items} items, capacity={self.capacity})"


# Example knapsack problem
print("Knapsack Problem Example")
print("=" * 60)

values = [60, 100, 120, 80, 90]
weights = [10, 20, 30, 15, 25]
capacity = 50

knapsack = KnapsackProblem(values, weights, capacity)

print(f"Items: {knapsack.n_items}")
print(f"Capacity: {capacity}")
print()
print("Item | Value | Weight | Ratio")
print("-" * 40)
for i, (v, w) in enumerate(zip(values, weights)):
    ratio = v / w
    print(f"  {i}  |  {v:3d}  |   {w:2d}   | {ratio:.2f}")

# Greedy solution
greedy_sol = knapsack.greedy_solution()
greedy_value = knapsack.objective(greedy_sol)
greedy_weight = np.sum(greedy_sol * knapsack.weights)

print()
print("Greedy Solution:")
print(f"Selected items: {np.where(greedy_sol == 1)[0].tolist()}")
print(f"Total value: {greedy_value}")
print(f"Total weight: {greedy_weight}/{capacity}")
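
The greedy heuristic is fast but carries no optimality guarantee for the 0-1 knapsack. The sketch below compares it against exhaustive enumeration, which is only practical for a handful of items because it checks all 2^n selections. The helper names `brute_force_knapsack` and `greedy_by_ratio`, and the three-item instance, are assumptions added for illustration.

```python
from itertools import product

def brute_force_knapsack(values, weights, capacity):
    """Enumerate all 2^n selections and return (best_value, best_items)."""
    best_value, best_items = 0, ()
    for selection in product([0, 1], repeat=len(values)):
        weight = sum(w for w, s in zip(weights, selection) if s)
        if weight > capacity:
            continue  # Violates the weight constraint
        value = sum(v for v, s in zip(values, selection) if s)
        if value > best_value:
            best_value = value
            best_items = tuple(i for i, s in enumerate(selection) if s)
    return best_value, best_items

def greedy_by_ratio(values, weights, capacity):
    """Greedy heuristic: take items in order of decreasing value/weight."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total, remaining, chosen = 0, capacity, []
    for i in order:
        if weights[i] <= remaining:
            chosen.append(i)
            total += values[i]
            remaining -= weights[i]
    return total, tuple(sorted(chosen))

# The five-item instance from this lab.
lab_values, lab_weights, lab_capacity = [60, 100, 120, 80, 90], [10, 20, 30, 15, 25], 50
lab_exact = brute_force_knapsack(lab_values, lab_weights, lab_capacity)
lab_greedy = greedy_by_ratio(lab_values, lab_weights, lab_capacity)

# A hypothetical three-item instance where the greedy choice is not optimal.
small_exact = brute_force_knapsack([60, 100, 120], [10, 20, 30], 50)
small_greedy = greedy_by_ratio([60, 100, 120], [10, 20, 30], 50)

print("Lab instance   - exact:", lab_exact, "greedy:", lab_greedy)
print("Small instance - exact:", small_exact, "greedy:", small_greedy)
```

On the lab instance the greedy selection happens to match the exact optimum, while on the smaller instance it fills the knapsack with high-ratio items and leaves no room for the more valuable pair.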

## Part 2: Objective Functions and Landscapes

The **objective function** defines what "best" means.

### Visualization: Optimization Landscapes

Let's visualize different types of optimization landscapes:


In [None]:
def create_landscapes():
    """Create different optimization landscape examples."""
    
    x = np.linspace(-5, 5, 1000)
    
    # 1. Convex (easy - single global optimum)
    y1 = x**2
    
    # 2. Non-convex with local minima
    y2 = x**4 - 4*x**3 + 3*x**2 + 2*x
    
    # 3. Multimodal (many local optima)
    y3 = x * np.sin(3*x) + 0.5*x**2
    
    # 4. Noisy (real-world)
    y4 = x**2 + 0.5*np.random.randn(len(x))
    
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    
    landscapes = [
        (y1, "Convex (Easy)", "Single global minimum"),
        (y2, "Non-Convex", "Multiple local minima"),
        (y3, "Multimodal", "Many local optima"),
        (y4, "Noisy", "Real-world complexity")
    ]
    
    for ax, (y, title, subtitle) in zip(axes.flat, landscapes):
        ax.plot(x, y, linewidth=2, color='blue')
        
        # Mark global minimum
        min_idx = np.argmin(y)
        ax.plot(x[min_idx], y[min_idx], 'r*', markersize=20, 
               label='Global minimum', zorder=5)
        
        ax.set_xlabel('x', fontweight='bold')
        ax.set_ylabel('f(x)', fontweight='bold')
        ax.set_title(f'{title}\n{subtitle}', fontweight='bold')
        ax.legend()
        ax.grid(alpha=0.3)
    
    plt.tight_layout()
    plt.show()

print("Optimization Landscape Types")
print("=" * 60)
create_landscapes()

print("\nKey Insights:")
print("- Convex: Gradient descent with a suitable step size reaches the global minimum")
print("- Non-convex: May get stuck in local minima")
print("- Multimodal: Need global search strategies")
print("- Noisy: Need robust optimization methods")
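
One way to tell the first two landscapes apart numerically is to look at discrete second differences: for a convex function sampled on a uniform grid they are never negative. The following is a rough sketch under that assumption; `looks_convex_on_grid` is a hypothetical helper, and passing the check on a finite grid is evidence of convexity, not a proof.

```python
import numpy as np

def looks_convex_on_grid(y, tol=1e-9):
    """Return True if every discrete second difference of y is >= -tol.

    On a uniform grid, y[i-1] - 2*y[i] + y[i+1] >= 0 at every interior
    point is a necessary condition for convexity of the sampled function.
    """
    second_diff = y[:-2] - 2.0 * y[1:-1] + y[2:]
    return bool(np.all(second_diff >= -tol))

grid = np.linspace(-5, 5, 1001)
convex_bowl = grid**2
quartic = grid**4 - 4*grid**3 + 3*grid**2 + 2*grid

print("x^2 passes the convexity check:    ", looks_convex_on_grid(convex_bowl))
print("quartic passes the convexity check:", looks_convex_on_grid(quartic))
```

The quartic fails because its curvature is negative around x = 1, the local maximum that separates its two minima.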

### 3D Optimization Landscape

Let's plot two objective functions of two variables, f(x, y), as 3D surfaces:


In [None]:
def rastrigin(x, y):
    """Rastrigin function - multimodal test function."""
    A = 10
    return (A * 2 + (x**2 - A * np.cos(2 * np.pi * x)) + 
           (y**2 - A * np.cos(2 * np.pi * y)))

def ackley(x, y):
    """Ackley function - another multimodal test function."""
    return (-20 * np.exp(-0.2 * np.sqrt(0.5 * (x**2 + y**2))) - 
           np.exp(0.5 * (np.cos(2 * np.pi * x) + np.cos(2 * np.pi * y))) + 
           np.e + 20)

# Create meshgrid
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)

# Calculate function values
Z_rastrigin = rastrigin(X, Y)
Z_ackley = ackley(X, Y)

# 3D surface plots
fig = plt.figure(figsize=(16, 6))

# Rastrigin
ax1 = fig.add_subplot(121, projection='3d')
surf1 = ax1.plot_surface(X, Y, Z_rastrigin, cmap='viridis', alpha=0.8)
ax1.set_xlabel('x', fontweight='bold')
ax1.set_ylabel('y', fontweight='bold')
ax1.set_zlabel('f(x, y)', fontweight='bold')
ax1.set_title('Rastrigin Function\n(Many local minima)', fontweight='bold')
fig.colorbar(surf1, ax=ax1, shrink=0.5)

# Ackley
ax2 = fig.add_subplot(122, projection='3d')
surf2 = ax2.plot_surface(X, Y, Z_ackley, cmap='plasma', alpha=0.8)
ax2.set_xlabel('x', fontweight='bold')
ax2.set_ylabel('y', fontweight='bold')
ax2.set_zlabel('f(x, y)', fontweight='bold')
ax2.set_title('Ackley Function\n(Complex landscape)', fontweight='bold')
fig.colorbar(surf2, ax=ax2, shrink=0.5)

plt.tight_layout()
plt.show()

print("These are challenging optimization problems!")
print("Both functions have their global minimum f(0, 0) = 0, surrounded by many local minima.")
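
The structure of the Rastrigin surface can be checked with a few evaluations. At integer coordinates the cosine terms equal 1, so the function reduces to x^2 + y^2; its local minima sit close to these lattice points and get worse with distance from the origin. The sketch below re-declares the function under the assumed name `rastrigin_2d` so that it runs on its own.

```python
import numpy as np

def rastrigin_2d(x, y, A=10):
    """Two-variable Rastrigin function, same formula as in the cell above."""
    return (2 * A + (x**2 - A * np.cos(2 * np.pi * x)) +
            (y**2 - A * np.cos(2 * np.pi * y)))

# Values at integer lattice points (bottoms of the "valleys", approximately).
for point in [(0, 0), (1, 0), (1, 1), (2, -2)]:
    print(f"f{point} = {rastrigin_2d(*point):.4f}")

# Value halfway between lattice points (top of a "ridge").
ridge_value = rastrigin_2d(0.5, 0.5)
print(f"f(0.5, 0.5) = {ridge_value:.4f}")
```

A local search that starts near (2, -2) has to climb over ridges like the one at (0.5, 0.5) to reach the origin, which is why this function is a common stress test.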

## Part 3: Local vs Global Optima

### Definitions

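**Local Optimum**: Best solution within a neighborhood
- For minimization: f(x*) ≤ f(x) for all x in some neighborhood of x*

**Global Optimum**: Best solution over the entire domain
- For minimization: f(x*) ≤ f(x) for all x in the entire domain

Every global optimum is also a local optimum, but not every local optimum is global.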

**Challenge**: Many optimization algorithms can get stuck in local optima!
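
To illustrate how the starting point decides the outcome, here is a minimal sketch of plain gradient descent on f(x) = x sin(x) + 0.1 x^2. The helper names (`wavy`, `wavy_grad`, `gradient_descent`), the learning rate, and the two starting points are assumptions chosen for illustration, not part of the lab's own code.

```python
import numpy as np

def wavy(x):
    """Objective with several minima: f(x) = x*sin(x) + 0.1*x^2."""
    return x * np.sin(x) + 0.1 * x**2

def wavy_grad(x):
    """Analytic derivative of wavy(x)."""
    return np.sin(x) + x * np.cos(x) + 0.2 * x

def gradient_descent(x0, learning_rate=0.01, n_steps=2000):
    """Fixed-step gradient descent from x0; returns the final x."""
    x = float(x0)
    for _ in range(n_steps):
        x -= learning_rate * wavy_grad(x)
    return x

x_from_6 = gradient_descent(6.0)    # Slides into the deeper basin
x_from_12 = gradient_descent(12.0)  # Slides into a shallower basin

print(f"Start at  6 -> x = {x_from_6:.3f}, f(x) = {wavy(x_from_6):.3f}")
print(f"Start at 12 -> x = {x_from_12:.3f}, f(x) = {wavy(x_from_12):.3f}")
```

Both runs stop at a point where the gradient vanishes, but only the first one ends in the lower of the two basins. Nothing in the local gradient information tells the second run that a better solution exists elsewhere.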


In [None]:
def demonstrate_local_vs_global():
    """Demonstrate the difference between local and global optima."""
    
    # Create a function with multiple local minima
    # (the domain [0, 20] contains three interior minima; [0, 10] contains only one)
    x = np.linspace(0, 20, 1000)
    y = np.sin(x) * x + 0.1 * x**2
    
    # Global minimum (lowest sampled value)
    global_min_idx = np.argmin(y)
    global_min = (x[global_min_idx], y[global_min_idx])
    
    # Find interior local minima: points lower than both neighbors,
    # excluding the global minimum so it is not marked twice
    local_minima = []
    for i in range(1, len(y) - 1):
        if y[i] < y[i-1] and y[i] < y[i+1] and i != global_min_idx:
            local_minima.append((x[i], y[i]))
    
    # Plot
    plt.figure(figsize=(14, 6))
    plt.plot(x, y, 'b-', linewidth=2, label='Objective function')
    
    # Mark local minima
    for xm, ym in local_minima:
        plt.plot(xm, ym, 'yo', markersize=12, markeredgecolor='orange',
                markeredgewidth=2, label='Local minimum' if xm == local_minima[0][0] else '')
    
    # Mark global minimum
    plt.plot(global_min[0], global_min[1], 'r*', markersize=25,
            markeredgecolor='darkred', markeredgewidth=2, label='Global minimum')
    
    plt.xlabel('x', fontsize=12, fontweight='bold')
    plt.ylabel('f(x)', fontsize=12, fontweight='bold')
    plt.title('Local vs Global Optima', fontsize=14, fontweight='bold')
    plt.legend(fontsize=11)
    plt.grid(alpha=0.3)
    plt.tight_layout()
    plt.show()
    
    print(f"Global minimum: f({global_min[0]:.2f}) = {global_min[1]:.2f}")
    print(f"Found {len(local_minima)} local minima")
    print()
    print("Problem: If we start at the wrong place, we might find a local")
    print("minimum instead of the global minimum!")

demonstrate_local_vs_global()

## Part 4: Random Search Baseline

Before implementing sophisticated algorithms, let's establish a baseline:
**Random Search** - Simply try random solutions!


In [None]:
class RandomSearch:
    """Simple random search optimizer."""
    
    def __init__(self, objective_fn: Callable, bounds: List[Tuple[float, float]]):
        """
        Initialize random search.
        
        Args:
            objective_fn: Function to minimize
            bounds: List of (min, max) for each dimension
        """
        self.objective_fn = objective_fn
        self.bounds = bounds
        self.n_dims = len(bounds)
    
    def optimize(self, n_iterations: int) -> Tuple[np.ndarray, float, List[float]]:
        """
        Run random search.
        
        Returns:
            (best_solution, best_value, history)
        """
        best_solution = None
        best_value = np.inf
        history = []
        
        for _ in range(n_iterations):
            # Generate random solution
            solution = np.array([np.random.uniform(low, high) 
                               for low, high in self.bounds])
            
            # Evaluate
            value = self.objective_fn(solution)
            
            # Update best
            if value < best_value:
                best_value = value
                best_solution = solution.copy()
            
            history.append(best_value)
        
        return best_solution, best_value, history


# Test on sphere function (simple convex)
def sphere(x):
    """Sphere function: simple convex function."""
    return np.sum(x**2)

print("Random Search Example")
print("=" * 60)
print("Minimizing sphere function: f(x) = sum(x_i^2)")
print("Global minimum: f(0, 0) = 0")
print()

# 2D sphere function
bounds = [(-5, 5), (-5, 5)]
rs = RandomSearch(sphere, bounds)

# Run random search
best_sol, best_val, history = rs.optimize(n_iterations=1000)

print(f"Best solution: ({best_sol[0]:.4f}, {best_sol[1]:.4f})")
print(f"Best value: {best_val:.6f}")
print()

# Plot convergence
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(history, linewidth=2)
plt.xlabel('Iteration', fontweight='bold')
plt.ylabel('Best Value Found', fontweight='bold')
plt.title('Random Search Convergence', fontweight='bold')
plt.yscale('log')
plt.grid(alpha=0.3)

plt.subplot(1, 2, 2)
# Visualize search space
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
Z = X**2 + Y**2

plt.contourf(X, Y, Z, levels=20, cmap='viridis', alpha=0.6)
plt.colorbar(label='f(x, y)')
plt.plot(best_sol[0], best_sol[1], 'r*', markersize=20, 
        markeredgecolor='white', markeredgewidth=2, label='Best found')
plt.plot(0, 0, 'go', markersize=15, markeredgecolor='white', 
        markeredgewidth=2, label='True optimum')
plt.xlabel('x', fontweight='bold')
plt.ylabel('y', fontweight='bold')
plt.title('Solution in Search Space', fontweight='bold')
plt.legend()
plt.grid(alpha=0.3)

plt.tight_layout()
plt.show()
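
A back-of-the-envelope calculation shows why random search slows down as accuracy requirements or dimensions grow. For uniform samples in the box [-5, 5]^d, the chance that a single sample lands within distance r of the origin is the volume of a d-dimensional ball divided by the volume of the box. The sketch below evaluates this; `hit_probability` is a hypothetical helper, and the formula assumes the ball lies entirely inside the box.

```python
import math

def hit_probability(radius, n_samples, n_dims, half_width=5.0):
    """P(at least one of n uniform samples in [-h, h]^d is within `radius` of the origin).

    Assumes radius <= half_width so the ball lies inside the box.
    """
    # Volume of a d-dimensional ball: pi^(d/2) * r^d / Gamma(d/2 + 1)
    ball_volume = (math.pi ** (n_dims / 2) * radius ** n_dims
                   / math.gamma(n_dims / 2 + 1))
    p_single = ball_volume / (2 * half_width) ** n_dims
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p
    return -math.expm1(n_samples * math.log1p(-p_single))

for dims in (2, 5, 10):
    p = hit_probability(radius=0.1, n_samples=1000, n_dims=dims)
    print(f"d = {dims:2d}: P(best sample within 0.1 of optimum) = {p:.3g}")
```

In two dimensions, 1000 samples give roughly a one-in-four chance of landing within 0.1 of the optimum; in higher dimensions the same budget makes such a hit extremely unlikely.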

## Part 5: Optimization Problem Types

### Classification of Optimization Problems


In [None]:
print("Classification of Optimization Problems")
print("=" * 60)

classification = {
    "By Variables": {
        "Continuous": "Variables can take any real value (e.g., robot position)",
        "Discrete": "Variables from finite set (e.g., scheduling decisions)",
        "Mixed": "Combination of continuous and discrete"
    },
    "By Objective": {
        "Single-objective": "One objective to optimize",
        "Multi-objective": "Multiple conflicting objectives (Pareto optimization)"
    },
    "By Constraints": {
        "Unconstrained": "No constraints on variables",
        "Constrained": "Must satisfy constraints"
    },
    "By Convexity": {
        "Convex": "Easier - any local optimum is also a global optimum",
        "Non-convex": "Harder - may have many local optima"
    },
    "By Determinism": {
        "Deterministic": "Objective function is deterministic",
        "Stochastic": "Objective has randomness/noise"
    }
}

for category, types in classification.items():
    print(f"\n{category}:")
    for type_name, description in types.items():
        print(f"  • {type_name}: {description}")

## Part 6: Benchmark Functions

Standard test functions for comparing optimization algorithms:


In [None]:
class BenchmarkFunctions:
    """Common optimization benchmark functions."""
    
    @staticmethod
    def sphere(x):
        """Sphere: f(x) = sum(x_i^2). Global min at origin."""
        return np.sum(x**2)
    
    @staticmethod
    def rosenbrock(x):
        """Rosenbrock: narrow valley to global optimum."""
        return np.sum(100 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2)
    
    @staticmethod
    def rastrigin(x):
        """Rastrigin: highly multimodal."""
        A = 10
        n = len(x)
        return A * n + np.sum(x**2 - A * np.cos(2 * np.pi * x))
    
    @staticmethod
    def ackley(x):
        """Ackley: many local minima."""
        n = len(x)
        return (-20 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / n)) -
               np.exp(np.sum(np.cos(2 * np.pi * x)) / n) + 20 + np.e)
    
    @staticmethod
    def griewank(x):
        """Griewank: many widespread local minima."""
        sum_term = np.sum(x**2) / 4000
        prod_term = np.prod(np.cos(x / np.sqrt(np.arange(1, len(x) + 1))))
        return sum_term - prod_term + 1


# Test all benchmark functions
print("Benchmark Function Comparison")
print("=" * 60)

bench = BenchmarkFunctions()
functions = [
    ('Sphere', bench.sphere, 'Convex, unimodal'),
    ('Rosenbrock', bench.rosenbrock, 'Non-convex valley'),
    ('Rastrigin', bench.rastrigin, 'Highly multimodal'),
    ('Ackley', bench.ackley, 'Many local minima'),
    ('Griewank', bench.griewank, 'Multimodal')
]

# Evaluate at different points
test_points = [
    np.array([0.0, 0.0]),
    np.array([1.0, 1.0]),
    np.array([2.0, -2.0])
]

print("\nFunction values at test points:")
print("Function      | (0,0)  | (1,1)  | (2,-2)  | Type")
print("-" * 65)

for name, func, func_type in functions:
    # Use a separate name so the knapsack `values` list defined earlier is not overwritten
    vals = [func(pt) for pt in test_points]
    print(f"{name:13s} | {vals[0]:6.2f} | {vals[1]:6.2f} | {vals[2]:7.2f} | {func_type}")

print("\nSphere, Rastrigin, Ackley, and Griewank have their global minimum at the origin;")
print("Rosenbrock has its global minimum at (1, 1).")
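
The locations of the global minima can be confirmed by evaluating each function at its known minimizer. This sketch re-declares the five benchmarks under assumed standalone names (`sphere_fn`, `rosenbrock_fn`, and so on) so it can run independently of the class above; the formulas are the same.

```python
import numpy as np

def sphere_fn(x):
    return np.sum(x**2)

def rosenbrock_fn(x):
    return np.sum(100 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2)

def rastrigin_fn(x, A=10):
    return A * len(x) + np.sum(x**2 - A * np.cos(2 * np.pi * x))

def ackley_fn(x):
    n = len(x)
    return (-20 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / n)) -
            np.exp(np.sum(np.cos(2 * np.pi * x)) / n) + 20 + np.e)

def griewank_fn(x):
    scale = np.sqrt(np.arange(1, len(x) + 1))
    return np.sum(x**2) / 4000 - np.prod(np.cos(x / scale)) + 1

# Known global minimizers in 2D; each function has minimum value 0 there.
known_minimizers = {
    "Sphere": (sphere_fn, np.zeros(2)),
    "Rosenbrock": (rosenbrock_fn, np.ones(2)),
    "Rastrigin": (rastrigin_fn, np.zeros(2)),
    "Ackley": (ackley_fn, np.zeros(2)),
    "Griewank": (griewank_fn, np.zeros(2)),
}

minimum_values = {name: float(fn(pt)) for name, (fn, pt) in known_minimizers.items()}
for name, value in minimum_values.items():
    print(f"{name:10s} minimum value: {value:.2e}")
```

Ackley may print a value on the order of 1e-16 rather than exactly zero, which is floating-point rounding and not a sign that the minimizer is elsewhere.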

## Part 7: Comparing Random Search on Benchmarks

Let's compare random search performance on different landscapes:


In [None]:
print("Random Search Performance on Benchmark Functions")
print("=" * 60)

bench = BenchmarkFunctions()
bounds = [(-5, 5), (-5, 5)]
n_iterations = 1000

functions = [
    ('Sphere', bench.sphere),
    ('Rosenbrock', bench.rosenbrock),
    ('Rastrigin', bench.rastrigin),
    ('Ackley', bench.ackley)
]

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

for ax, (name, func) in zip(axes.flat, functions):
    # Run random search
    rs = RandomSearch(func, bounds)
    best_sol, best_val, history = rs.optimize(n_iterations)
    
    # Plot convergence
    ax.plot(history, linewidth=2, color='blue')
    ax.set_xlabel('Iteration', fontweight='bold')
    ax.set_ylabel('Best Value', fontweight='bold')
    ax.set_title(f'{name} Function\nFinal: {best_val:.6f}', fontweight='bold')
    ax.set_yscale('log')
    ax.grid(alpha=0.3)

plt.tight_layout()
plt.show()

print("\nTypical observations (exact values vary from run to run):")
print("- Sphere: Fast convergence (easy problem)")
print("- Rosenbrock: Slower (narrow valley)")
print("- Rastrigin: Difficult (many local minima)")
print("- Ackley: Challenging (complex landscape)")
print("\nRandom search is not efficient - we need smarter algorithms!")

## Exercises

### Exercise 1: Custom Objective Function
Create your own 2D objective function with at least 2 local minima.
Visualize it and test random search on it.

In [None]:
# TODO: Create and test custom objective function
# Your code here
pass

### Exercise 2: Knapsack Variants
Implement and compare different greedy strategies for the knapsack problem:
- Sort by value
- Sort by weight
- Sort by value/weight ratio

In [None]:
# TODO: Compare greedy strategies
# Your code here
pass

### Exercise 3: Random Search Analysis
Run random search 10 times on the same problem with different random seeds.
Plot the distribution of final values and analyze variance.

In [None]:
# TODO: Analyze random search variance
# Your code here
pass

## Summary

### Key Takeaways

1. **Optimization** - Finding the best solution among many possibilities
2. **Objective Function** - Defines what "best" means
3. **Constraints** - What solutions are feasible
4. **Local vs Global** - Getting stuck vs finding the true optimum
5. **Problem Types** - Different problems need different algorithms
6. **Benchmarks** - Standard functions to test algorithms

### Why This Matters

- **Universal**: Almost every AI problem involves optimization
- **Practical**: Real-world problems are often optimization problems
- **Foundation**: Understanding needed for ML, deep learning, RL

### Next Steps

In Lab 2, we'll learn **Local Search Algorithms** that are much smarter than random search:
- Hill Climbing
- Simulated Annealing
- And more!
