# Block 4: From Data to Decisions - Optimization Modeling

**Python Module for Incoming ISE & OR PhD Students**  
Instructor: Will Kirschenman | August 7, 2025 | 1:00 PM - 1:50 PM

---

## Welcome Back to Block 4! 🎯

Welcome back from lunch! In the previous blocks, we've journeyed from Python basics to data wrangling to predictive modeling. Now we're taking the final step: **from data to decisions**. 

In Block 3, we learned to predict "what might happen" using machine learning. But as Industrial & Systems Engineers and Operations Researchers, our ultimate goal is to determine "**what should happen**" - that's the power of **optimization**.

By the end of this block, you'll be able to:
- Understand the fundamentals of Linear Programming and Mixed-Integer Linear Programming
- Model real-world decision problems using Pyomo
- Solve optimization problems with different solvers
- Interpret solutions and make data-driven decisions
- Appreciate the difference between open-source and commercial solvers

**Our Mission**: We'll help incoming PhD students like yourselves optimize course selections for maximum learning value while respecting time conflicts and prerequisites. Then we'll explore a fun campus tour problem that showcases solver performance differences!

### The Optimization Mindset 🧠

**Prediction** (Block 3): "Based on your coffee consumption and Hunt Library hours, you'll likely publish 3.2 papers"  
**Optimization** (Block 4): "To maximize research output, you should take these courses, attend these conferences, and schedule advisor meetings this way"

**This is the difference between insight and action!**

## Complete Setup: Package Installation, Solver Setup & Import Libraries

The cell below handles everything you need for Block 4:
- ✅ **Package installation**: Core packages (numpy, pandas, matplotlib, pyomo, plotly)
- ✅ **Solver installation**: HiGHS via pip, plus CBC via conda where conda is available
- ✅ **Library imports**: All necessary imports for optimization modeling
- ✅ **Solver detection**: Automatically finds available solvers on your system

Run this one cell and you'll be ready for all optimization examples!

In [None]:
# 📦 Complete Package Installation & Setup for Block 4
# Run this cell first: later cells rely on its imports and on `available_solvers`
# Most packages are pre-installed in Google Colab, so the install steps are usually skipped

import sys
import subprocess
import time
from itertools import combinations

def install_package(package_name):
    """Install a package using pip if not already installed"""
    try:
        __import__(package_name.split('==')[0])
        print(f"✅ {package_name} already installed")
    except ImportError:
        print(f"📦 Installing {package_name}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
        print(f"✅ {package_name} installed successfully")

def install_package_safe(package_name):
    """Install a package with error handling for problematic packages"""
    try:
        __import__(package_name.split('==')[0])
        print(f"✅ {package_name} already installed")
    except ImportError:
        try:
            print(f"📦 Installing {package_name}...")
            subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
            print(f"✅ {package_name} installed successfully")
        except Exception as e:
            print(f"⚠️  {package_name} installation failed: {e}")
            print(f"💡 Don't worry - the notebook will still work without {package_name}")

# Core packages used in this notebook (Block 4: Optimization)
print("🔍 Checking required packages for Block 4...")
print("=" * 45)

core_packages = [
    'numpy',
    'pandas', 
    'matplotlib',
    'pyomo',
    'plotly'
]

for package in core_packages:
    install_package(package)

# Install optimization solvers
print("\n🔧 Installing optimization solvers...")
print("=" * 35)

# Try HiGHS first (works on all platforms)
install_package_safe('highspy')

# Optionally, try CBC via conda if available
try:
    subprocess.check_call(["conda", "install", "-c", "conda-forge", "coincbc", "-y"], 
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    print("✅ CBC solver installed via conda")
except Exception:  # e.g. conda is not installed or the install command failed
    print("⚠️  CBC installation via conda failed (this is normal in most environments)")

print("\n🚀 Importing libraries and checking solver availability...")
print("=" * 55)

# Import all necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Pyomo for optimization modeling
import pyomo.environ as pyo
from pyomo.opt import SolverStatus, TerminationCondition, SolverFactory

print(f"📊 NumPy version: {np.__version__}")
print(f"📊 Pandas version: {pd.__version__}")
print(f"🎯 Pyomo imported and ready!")

# Check available solvers
solvers_to_check = ['highs', 'cbc', 'glpk', 'gurobi', 'cplex']
available_solvers = []

print(f"\n🔍 Checking solver availability:")
for solver_name in solvers_to_check:
    try:
        solver = SolverFactory(solver_name)
        if solver.available():
            available_solvers.append(solver_name)
            print(f"   ✅ {solver_name.upper()} - Available")
        else:
            print(f"   ❌ {solver_name.upper()} - Not available")
    except Exception:  # solver plugin missing or its availability check failed
        print(f"   ❌ {solver_name.upper()} - Not found")

if available_solvers:
    print(f"\n🎉 Ready to optimize with: {', '.join(available_solvers).upper()}")
    primary_solver = available_solvers[0]
    print(f"💡 We'll use {primary_solver.upper()} as our primary solver")
else:
    print(f"\n⚠️  No solvers detected - trying basic installation...")
    # Fallback: try to install a basic solver, then confirm Pyomo can actually use it
    primary_solver = None
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", "coinor-cbc"])
        if SolverFactory('cbc').available(exception_flag=False):
            print("✅ Installed CBC as fallback solver")
            available_solvers = ['cbc']
            primary_solver = 'cbc'
        else:
            print("❌ Fallback install finished, but Pyomo still cannot find CBC")
    except Exception:
        print("❌ Could not install fallback solver")

if primary_solver is not None:
    print("\n🎉 All packages ready! You can now run all cells without import errors.")
else:
    print("\n⚠️  Packages are imported, but no solver was found; the solve steps below will be skipped.")
print("💡 Tip: In Google Colab, most packages are pre-installed, so you likely won't need to install anything!")

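If the detection loop above reports no solvers, it can help to check what is installed independently of Pyomo. The sketch below uses only the standard library. It assumes that HiGHS is provided by the `highspy` Python package and that CBC and GLPK are found as `cbc` and `glpsol` executables on the PATH; the helper name `diagnose_solvers` is hypothetical and not part of the notebook.

```python
import importlib.util
import shutil

def diagnose_solvers():
    """Report which solver back ends are visible to this Python environment."""
    return {
        # HiGHS ships as a Python package, so look for an importable module
        "highs (highspy module)": importlib.util.find_spec("highspy") is not None,
        # CBC and GLPK are command-line programs, so look for them on the PATH
        "cbc (cbc executable)": shutil.which("cbc") is not None,
        "glpk (glpsol executable)": shutil.which("glpsol") is not None,
    }

for name, found in diagnose_solvers().items():
    print(f"{'found  ' if found else 'missing'}  {name}")
```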
---

# Part 1: Optimization Fundamentals 🧮

## What Is Optimization?

Optimization is about making the **best** decision given constraints and objectives. It's everywhere in OR/ISE:

- **Manufacturing**: Minimize costs while meeting demand
- **Transportation**: Find shortest routes while respecting capacity
- **Scheduling**: Maximize productivity while balancing workloads
- **Portfolio**: Maximize returns while limiting risk
- **PhD Life**: Maximize learning while minimizing stress 😅

### The Three Essential Components

Every optimization problem has exactly three components:

1. **🎯 Decision Variables**: What can we control or decide?
2. **📊 Objective Function**: What do we want to optimize (maximize or minimize)?
3. **⚖️ Constraints**: What limitations must we respect?

Let's see this with a simple example that every PhD student can relate to...

## Example: PhD Student Time Allocation

**Scenario**: You have 10 hours to spend between studying and sleeping. How should you allocate your time to maximize your "life satisfaction"?

**Mathematical Formulation**:
- **Decision Variables**: 
  - x₁ = hours spent studying
  - x₂ = hours spent sleeping
- **Objective Function**: Maximize satisfaction = 2x₁ + 3x₂
  - (Each study hour gives 2 satisfaction points, each sleep hour gives 3)
- **Constraints**:
  - x₁ + x₂ ≤ 10 (time limit)
  - x₁ ≥ 3 (minimum study requirement)
  - x₂ ≥ 4 (minimum sleep for survival)
  - x₁, x₂ ≥ 0 (non-negativity)

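A problem this small can also be checked without a solver. The sketch below is a plain-Python illustration (not the method used later in the notebook): it intersects the boundary lines of each pair of constraints, keeps the intersections that satisfy every constraint, and evaluates the objective 2x₁ + 3x₂ at each one. The non-negativity bounds are left out because x₁ ≥ 3 and x₂ ≥ 4 already imply them; the helper names are hypothetical.

```python
from itertools import combinations

# Each constraint is written as a*x1 + b*x2 <= c and stored as (a, b, c)
constraints = [
    (1, 1, 10),    # x1 + x2 <= 10  (time limit)
    (-1, 0, -3),   # x1 >= 3        (minimum study)
    (0, -1, -4),   # x2 >= 4        (minimum sleep)
]

def intersection(c1, c2):
    """Intersect the boundary lines of two constraints (None if they are parallel)."""
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule for the 2x2 linear system
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def is_feasible(point, tol=1e-9):
    """Check a point against every constraint, with a small numerical tolerance."""
    x1, x2 = point
    return all(a * x1 + b * x2 <= c + tol for a, b, c in constraints)

corners = []
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and is_feasible(point):
        corners.append(point)

best = max(corners, key=lambda p: 2 * p[0] + 3 * p[1])
best_value = 2 * best[0] + 3 * best[1]
print(f"Corner points: {sorted(corners)}")
print(f"Best corner: study {best[0]:g} h, sleep {best[1]:g} h, satisfaction {best_value:g}")
```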
Let's visualize this graphically!

In [None]:
# Visualize the PhD student time allocation problem
import matplotlib.pyplot as plt
import numpy as np

# Create a grid of points
x1 = np.linspace(0, 12, 400)
x2 = np.linspace(0, 12, 400)
X1, X2 = np.meshgrid(x1, x2)

# Define constraints
constraint1 = X1 + X2 <= 10  # Time limit
constraint2 = X1 >= 3        # Minimum study
constraint3 = X2 >= 4        # Minimum sleep
constraint4 = X1 >= 0        # Non-negativity
constraint5 = X2 >= 0        # Non-negativity

# Feasible region (all constraints satisfied)
feasible = constraint1 & constraint2 & constraint3 & constraint4 & constraint5

# Create the plot
plt.figure(figsize=(12, 8))

# Plot constraints
plt.contour(X1, X2, X1 + X2, levels=[10], colors='red', linestyles='-', linewidths=2, alpha=0.8)
plt.axvline(x=3, color='blue', linestyle='--', linewidth=2, alpha=0.8)
plt.axhline(y=4, color='green', linestyle='--', linewidth=2, alpha=0.8)

# Fill feasible region
plt.contourf(X1, X2, feasible.astype(int), levels=[0.5, 1.5], colors=['lightblue'], alpha=0.5)

# Plot objective function contours (iso-profit lines)
objective_levels = [12, 18, 24, 30]
CS = plt.contour(X1, X2, 2*X1 + 3*X2, levels=objective_levels, colors='purple', alpha=0.6)
plt.clabel(CS, inline=True, fontsize=10, fmt='Satisfaction = %d')

# Mark the optimal solution
optimal_x1, optimal_x2 = 3, 7  # We'll calculate this properly later
plt.plot(optimal_x1, optimal_x2, 'ro', markersize=12, label=f'Optimal Solution: ({optimal_x1}, {optimal_x2})')

# Mark corner points of feasible region
corner_points = [(3, 4), (3, 7), (6, 4)]
for point in corner_points:
    plt.plot(point[0], point[1], 'ko', markersize=8)
    plt.annotate(f'({point[0]}, {point[1]})', point, xytext=(5, 5), textcoords='offset points')

# Formatting
plt.xlim(0, 11)
plt.ylim(0, 11)
plt.xlabel('Study Hours (x₁)', fontsize=12)
plt.ylabel('Sleep Hours (x₂)', fontsize=12)
plt.title('PhD Student Time Allocation Optimization\n(Linear Programming Visualization)', fontsize=14)
plt.grid(True, alpha=0.3)
plt.legend(loc='upper right')

# Add constraint labels
plt.text(8, 2.5, 'x₁ + x₂ ≤ 10', fontsize=11, color='red', weight='bold')
plt.text(1, 7, 'x₁ ≥ 3', fontsize=11, color='blue', weight='bold', rotation=90)
plt.text(7, 3.5, 'x₂ ≥ 4', fontsize=11, color='green', weight='bold')
# Place the label inside the feasible triangle with corners (3, 4), (3, 7), (6, 4)
plt.text(4, 5, 'Feasible\nRegion', fontsize=12, weight='bold', ha='center', va='center',
         bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.7))

plt.tight_layout()
plt.show()

# Calculate the objective value at corner points
print("🔍 Evaluating Corner Points:")
print("=" * 40)
for point in corner_points:
    objective_value = 2 * point[0] + 3 * point[1]
    print(f"Point ({point[0]}, {point[1]}): Satisfaction = 2×{point[0]} + 3×{point[1]} = {objective_value}")

optimal_point = max(corner_points, key=lambda p: 2 * p[0] + 3 * p[1])
optimal_value = 2 * optimal_point[0] + 3 * optimal_point[1]
print(f"\n🎯 Optimal Solution: Study {optimal_point[0]} hours, Sleep {optimal_point[1]} hours")
print(f"🏆 Maximum Satisfaction: {optimal_value} points")
print(f"\n💡 Key Insight: For a linear program like this one, an optimal solution (when one exists) can always be found at a corner point of the feasible region!")

---

# Part 2: Introduction to Pyomo 🐍

## Why Pyomo?

Pyomo is a widely used open-source optimization modeling language for Python. It's **solver-agnostic**, meaning you can:
- Write your model once
- Solve with different solvers (HiGHS, CBC, GLPK, Gurobi, CPLEX, etc.)
- Switch between solvers by changing only the solver name, not the model

This is crucial for research where you might need different solvers for different problem types!

## Core Pyomo Components

1. **Model**: The container for your optimization problem
2. **Variables**: Decision variables (what you're optimizing)
3. **Constraints**: Limitations on your variables
4. **Objective**: What you want to maximize or minimize

Let's start with a simple example every PhD student understands: **coffee optimization**!

## Example: PhD Student Coffee Budget Optimization

**Scenario**: You have $20/week for coffee. You can buy:
- Hunt Library coffee ($3/cup, 2 energy units)
- Starbucks coffee ($5/cup, 3 energy units)
- Home brew ($1/cup, 1 energy unit)

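**Goal**: Maximize energy while staying within budget, drinking at least 10 cups total (PhD survival requirement), and making sure at least 30% of your cups are "real" coffee (Hunt Library or Starbucks rather than home brew).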

Let's model this in Pyomo!

In [None]:
# PhD Student Coffee Budget Optimization with Pyomo
import pyomo.environ as pyo

# Step 1: Create the model
coffee_model = pyo.ConcreteModel()

print("☕ Setting up the PhD Coffee Optimization Problem...")
print("=" * 50)

# Step 2: Define decision variables
coffee_model.hunt_cups = pyo.Var(within=pyo.NonNegativeIntegers)  # Hunt Library coffee
coffee_model.starbucks_cups = pyo.Var(within=pyo.NonNegativeIntegers)  # Starbucks coffee
coffee_model.home_cups = pyo.Var(within=pyo.NonNegativeIntegers)  # Home brew

print("✅ Decision Variables defined:")
print("   - hunt_cups: Cups of Hunt Library coffee")
print("   - starbucks_cups: Cups of Starbucks coffee")
print("   - home_cups: Cups of home brew")

# Step 3: Define the objective function (maximize energy)
coffee_model.energy = pyo.Objective(
    expr=2*coffee_model.hunt_cups + 3*coffee_model.starbucks_cups + 1*coffee_model.home_cups,
    sense=pyo.maximize
)

print("\n🎯 Objective: Maximize total energy")
print("   Energy = 2×hunt_cups + 3×starbucks_cups + 1×home_cups")

# Step 4: Define constraints
# Budget constraint: $20 total
coffee_model.budget_constraint = pyo.Constraint(
    expr=3*coffee_model.hunt_cups + 5*coffee_model.starbucks_cups + 1*coffee_model.home_cups <= 20
)

# Minimum cups constraint: At least 10 cups total (PhD survival)
coffee_model.minimum_cups = pyo.Constraint(
    expr=coffee_model.hunt_cups + coffee_model.starbucks_cups + coffee_model.home_cups >= 10
)

# Coffee balance constraint: At least 30% should be "real" coffee (not home brew)
coffee_model.coffee_quality = pyo.Constraint(
    expr=coffee_model.hunt_cups + coffee_model.starbucks_cups >= 0.3 * (coffee_model.hunt_cups + coffee_model.starbucks_cups + coffee_model.home_cups)
)

print("\n⚖️ Constraints defined:")
print("   1. Budget: 3×hunt + 5×starbucks + 1×home ≤ $20")
print("   2. Survival: hunt + starbucks + home ≥ 10 cups")
print("   3. Quality: hunt + starbucks ≥ 30% of total cups")

# Step 5: Solve the model
print("\n🔍 Solving the coffee optimization problem...")

# Use the first available solver found by the setup cell
if available_solvers:
    solver = pyo.SolverFactory(available_solvers[0])
    result = solver.solve(coffee_model)
    
    if result.solver.termination_condition == pyo.TerminationCondition.optimal:
        print("\n🎉 OPTIMAL SOLUTION FOUND!")
        print("=" * 40)
        print(f"Hunt Library coffee: {pyo.value(coffee_model.hunt_cups):.0f} cups")
        print(f"Starbucks coffee: {pyo.value(coffee_model.starbucks_cups):.0f} cups")
        print(f"Home brew: {pyo.value(coffee_model.home_cups):.0f} cups")
        print(f"\nTotal energy: {pyo.value(coffee_model.energy):.0f} units")
        
        # Calculate total cost and cups
        total_cost = (3*pyo.value(coffee_model.hunt_cups) + 
                     5*pyo.value(coffee_model.starbucks_cups) + 
                     1*pyo.value(coffee_model.home_cups))
        total_cups = (pyo.value(coffee_model.hunt_cups) + 
                     pyo.value(coffee_model.starbucks_cups) + 
                     pyo.value(coffee_model.home_cups))
        
        print(f"Total cost: ${total_cost:.2f} (Budget: $20)")
        print(f"Total cups: {total_cups:.0f}")
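        print(f"\n💡 Insight: Home brew gives the most energy per dollar (1.00 vs. 0.67 for Hunt and 0.60 for Starbucks); only the 30% quality rule pushes you toward café coffee!")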
        
    else:
        print("❌ No optimal solution found")
        print(f"Solver status: {result.solver.termination_condition}")
else:
    print("❌ No solvers available. Please install a solver first.")

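Because the coffee model has only three small integer variables, the solver's answer can be cross-checked by brute force. The sketch below is a plain-Python illustration that enumerates every purchase plan within the budget and keeps the best one that satisfies the same three constraints as the Pyomo model. The function name `best_coffee_plan` and its parameters are hypothetical, chosen here for the example.

```python
def best_coffee_plan(budget=20, min_cups=10):
    """Enumerate every integer purchase plan and return the best feasible one.

    Returns (energy, hunt, starbucks, home), or None if no plan is feasible.
    """
    best = None
    for hunt in range(budget // 3 + 1):
        for starbucks in range(budget // 5 + 1):
            for home in range(budget + 1):
                cost = 3 * hunt + 5 * starbucks + 1 * home
                cups = hunt + starbucks + home
                within_budget = cost <= budget
                enough_cups = cups >= min_cups
                # "At least 30% real coffee", in integer arithmetic to avoid rounding issues
                enough_quality = 10 * (hunt + starbucks) >= 3 * cups
                if within_budget and enough_cups and enough_quality:
                    energy = 2 * hunt + 3 * starbucks + 1 * home
                    if best is None or energy > best[0]:
                        best = (energy, hunt, starbucks, home)
    return best

print(best_coffee_plan())
```

The printed tuple should agree with the solver output above. Calling `best_coffee_plan(budget=15)` is one way to explore the first exercise below; a return value of `None` would mean that no plan satisfies every constraint.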
### 🤔 Mini Exercise: Modify the Coffee Model

Try changing the parameters and see how the solution changes:

1. **Change the budget** from $20 to $15 - what happens?
2. **Change Starbucks energy** from 3 to 2 - is it still optimal?
3. **Add a constraint** that you can drink at most 2 Starbucks cups per week

**This is the power of optimization modeling - you can easily explore different scenarios!**

---

# Part 3: Real-World Application - NC State Course Scheduling 📚

## The Challenge

You're an incoming PhD student in ISE/OR at NC State. You need to select courses for your first semester that:
- Maximize your learning value
- Respect time conflicts (can't be in two places at once!)
- Meet credit hour requirements
- Balance different methodology areas
- Keep prerequisites in mind (listed in the catalog, but treated as recommendations in this example)

This falls under **Mixed-Integer Linear Programming (MILP)**:
- **Mixed**: MILPs may combine continuous and integer variables; our model happens to use only binary ones, which makes it a pure 0-1 special case
- **Integer**: You either take a course (1) or you don't (0)
- **Linear**: All constraints and objectives are linear functions

Let's build this step by step!

In [None]:
# NC State Course Scheduling Optimization
import pandas as pd
import numpy as np

# Step 1: Create the course catalog data
print("📚 Creating NC State ISE/OR Course Catalog...")
print("=" * 50)

# Course data: [Course_ID, Course_Name, Credits, Learning_Value, Time_Slot, Area, Prerequisites]
courses_data = [
    ["IE511", "Linear Programming", 3, 9, "MWF_9", "Optimization", []],
    ["IE512", "Integer Programming", 3, 8, "TTh_130", "Optimization", ["IE511"]],
    ["IE513", "Stochastic Programming", 3, 7, "MWF_11", "Optimization", ["IE511"]],
    ["IE521", "Statistics for Engineers", 3, 8, "TTh_300", "Statistics", []],
    ["IE522", "Applied Regression", 3, 7, "MWF_9", "Statistics", ["IE521"]],
    ["IE531", "Simulation", 3, 8, "TTh_130", "Simulation", ["IE521"]],
    ["IE532", "Advanced Simulation", 3, 6, "MWF_11", "Simulation", ["IE531"]],
    ["IE541", "Operations Research", 3, 9, "TTh_300", "OR", []],
    ["IE542", "Network Optimization", 3, 7, "MWF_9", "OR", ["IE541"]],
    ["IE551", "Quality Engineering", 3, 6, "TTh_130", "Quality", ["IE521"]],
    ["IE561", "Production Systems", 3, 7, "MWF_11", "Manufacturing", []],
    ["IE571", "Human Factors", 3, 5, "TTh_300", "Human Factors", []],
    ["IE581", "Supply Chain", 3, 8, "MWF_9", "Supply Chain", ["IE541"]],
    ["IE591", "Research Methods", 3, 9, "TTh_130", "Research", []],
    ["IE598", "Special Topics", 3, 6, "MWF_11", "Special", []]
]

# Create DataFrame
courses_df = pd.DataFrame(courses_data, columns=[
    'Course_ID', 'Course_Name', 'Credits', 'Learning_Value', 'Time_Slot', 'Area', 'Prerequisites'
])

print("📋 Available Courses:")
print(courses_df[['Course_ID', 'Course_Name', 'Credits', 'Learning_Value', 'Time_Slot', 'Area']].to_string(index=False))

# Time slot definitions
time_slots = {
    'MWF_9': 'Monday/Wednesday/Friday 9:00-9:50 AM',
    'MWF_11': 'Monday/Wednesday/Friday 11:00-11:50 AM',
    'TTh_130': 'Tuesday/Thursday 1:30-2:45 PM',
    'TTh_300': 'Tuesday/Thursday 3:00-4:15 PM'
}

print("\n🕐 Time Slots:")
for slot_id, slot_desc in time_slots.items():
    print(f"   {slot_id}: {slot_desc}")

# Course areas for balance requirement
areas = courses_df['Area'].unique()
print(f"\n🎯 Course Areas: {', '.join(areas)}")

print("\n💡 Challenge: Select courses to maximize learning while respecting constraints!")

In [None]:
# Build the NC State Course Scheduling MILP Model
print("🏗️ Building the Course Scheduling Optimization Model...")
print("=" * 55)

# Create the model
schedule_model = pyo.ConcreteModel()

# Sets
schedule_model.COURSES = pyo.Set(initialize=courses_df['Course_ID'].tolist())
schedule_model.TIME_SLOTS = pyo.Set(initialize=list(time_slots.keys()))
schedule_model.AREAS = pyo.Set(initialize=areas.tolist())

# Parameters
schedule_model.credits = pyo.Param(schedule_model.COURSES, initialize=dict(zip(courses_df['Course_ID'], courses_df['Credits'])))
schedule_model.learning_value = pyo.Param(schedule_model.COURSES, initialize=dict(zip(courses_df['Course_ID'], courses_df['Learning_Value'])))

# Course-to-time-slot mapping
course_time_mapping = {}
for _, row in courses_df.iterrows():
    course_time_mapping[row['Course_ID']] = row['Time_Slot']

# Course-to-area mapping
course_area_mapping = dict(zip(courses_df['Course_ID'], courses_df['Area']))

# Decision Variables
schedule_model.take_course = pyo.Var(schedule_model.COURSES, within=pyo.Binary)

print("✅ Model structure created:")
print(f"   - {len(schedule_model.COURSES)} courses available")
print(f"   - {len(schedule_model.TIME_SLOTS)} time slots")
print(f"   - {len(schedule_model.AREAS)} subject areas")

# Objective: Maximize total learning value
schedule_model.total_learning = pyo.Objective(
    expr=sum(schedule_model.learning_value[c] * schedule_model.take_course[c] for c in schedule_model.COURSES),
    sense=pyo.maximize
)

print("\n🎯 Objective: Maximize total learning value")

# Constraints
print("\n⚖️ Adding constraints:")

# 1. Credit hour requirements (9-12 credits)
schedule_model.min_credits = pyo.Constraint(
    expr=sum(schedule_model.credits[c] * schedule_model.take_course[c] for c in schedule_model.COURSES) >= 9
)
schedule_model.max_credits = pyo.Constraint(
    expr=sum(schedule_model.credits[c] * schedule_model.take_course[c] for c in schedule_model.COURSES) <= 12
)
print("   ✅ Credit hours: 9-12 credits required")

# 2. Time conflict constraints (can't take two courses at same time)
schedule_model.time_conflicts = pyo.ConstraintList()
for slot in schedule_model.TIME_SLOTS:
    courses_in_slot = [c for c in schedule_model.COURSES if course_time_mapping[c] == slot]
    if len(courses_in_slot) > 1:
        schedule_model.time_conflicts.add(
            sum(schedule_model.take_course[c] for c in courses_in_slot) <= 1
        )
print("   ✅ Time conflicts: Max 1 course per time slot")

# 3. Prerequisites (not enforced in this example)
# We treat prerequisites as "recommended" rather than "required": enforcing them
# would need each student's enrollment history, which we don't have here.
# The empty ConstraintList is a placeholder for that logic.
schedule_model.prerequisites = pyo.ConstraintList()
print("   ℹ️ Prerequisites: Treated as recommendations (not enforced in this example)")

# 4. Area balance: Take courses from at least 2 different areas
# Binary indicator: area_used[a] = 1 if we take any course in area a, 0 otherwise
schedule_model.area_used = pyo.Var(schedule_model.AREAS, within=pyo.Binary)
schedule_model.area_balance = pyo.ConstraintList()
for area in schedule_model.AREAS:
    courses_in_area = [c for c in schedule_model.COURSES if course_area_mapping[c] == area]
    courses_taken = sum(schedule_model.take_course[c] for c in courses_in_area)
    # Taking any course in this area forces area_used = 1 ...
    schedule_model.area_balance.add(
        courses_taken <= len(courses_in_area) * schedule_model.area_used[area]
    )
    # ... and area_used = 1 requires at least one course in this area
    schedule_model.area_balance.add(courses_taken >= schedule_model.area_used[area])

# Require at least 2 distinct areas in the schedule
schedule_model.area_balance.add(
    sum(schedule_model.area_used[a] for a in schedule_model.AREAS) >= 2
)

print("   ✅ Area balance: At least 2 different areas required")

# 5. Must take at least one core course (high learning value)
core_courses = [c for c in schedule_model.COURSES if schedule_model.learning_value[c] >= 8]
schedule_model.core_requirement = pyo.Constraint(
    expr=sum(schedule_model.take_course[c] for c in core_courses) >= 1
)
print("   ✅ Core requirement: At least one high-value course")

print("\n🏗️ Model building complete! Ready to solve...")

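In symbols, the core of the model built above can be written as follows. This is a sketch with notation chosen here for illustration (it does not appear in the notebook): $x_c \in \{0,1\}$ stands for `take_course[c]`, $v_c$ for `learning_value[c]`, $k_c$ for `credits[c]`, $C$ for the set of courses, $T$ for the set of time slots, and $C_t$ for the courses that meet in slot $t$. The area indicator constraints are left out to keep the statement short.

$$
\begin{aligned}
\max \quad & \sum_{c \in C} v_c \, x_c \\
\text{s.t.} \quad & 9 \le \sum_{c \in C} k_c \, x_c \le 12 && \text{(credit hours)} \\
& \sum_{c \in C_t} x_c \le 1 \quad \forall t \in T && \text{(time conflicts)} \\
& \sum_{c \in C :\, v_c \ge 8} x_c \ge 1 && \text{(core requirement)} \\
& x_c \in \{0, 1\} \quad \forall c \in C
\end{aligned}
$$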
In [None]:
# Solve the NC State Course Scheduling Problem
print("🚀 Solving the NC State Course Scheduling Problem...")
print("=" * 55)

if available_solvers:
    # Use the first available solver found by the setup cell
    solver = pyo.SolverFactory(available_solvers[0])
    
    # Solve with timing
    start_time = time.time()
    result = solver.solve(schedule_model, tee=False)  # tee=True shows solver output
    solve_time = time.time() - start_time
    
    print(f"⏱️ Solve time: {solve_time:.4f} seconds")
    print(f"🔧 Solver used: {available_solvers[0].upper()}")
    
    if result.solver.termination_condition == pyo.TerminationCondition.optimal:
        print("\n🎉 OPTIMAL SCHEDULE FOUND!")
        print("=" * 40)
        
        # Extract solution
        selected_courses = []
        total_credits = 0
        total_learning = 0
        
        for course in schedule_model.COURSES:
            if pyo.value(schedule_model.take_course[course]) > 0.5:  # Course is selected
                course_info = courses_df[courses_df['Course_ID'] == course].iloc[0]
                selected_courses.append({
                    'Course_ID': course,
                    'Course_Name': course_info['Course_Name'],
                    'Credits': course_info['Credits'],
                    'Learning_Value': course_info['Learning_Value'],
                    'Time_Slot': course_info['Time_Slot'],
                    'Area': course_info['Area']
                })
                total_credits += course_info['Credits']
                total_learning += course_info['Learning_Value']
        
        # Display results
        if selected_courses:
            selected_df = pd.DataFrame(selected_courses)
            print("📚 Your Optimal Course Schedule:")
            print(selected_df.to_string(index=False))
            
            print(f"\n📊 Summary:")
            print(f"   Total Credits: {total_credits}")
            print(f"   Total Learning Value: {total_learning}")
            print(f"   Number of Courses: {len(selected_courses)}")
            
            # Area breakdown
            area_counts = selected_df['Area'].value_counts()
            print(f"\n🎯 Area Distribution:")
            for area, count in area_counts.items():
                print(f"   {area}: {count} course(s)")
            
            # Time slot breakdown
            print(f"\n🕐 Schedule by Time Slot:")
            for slot in sorted(selected_df['Time_Slot'].unique()):
                courses_in_slot = selected_df[selected_df['Time_Slot'] == slot]['Course_ID'].tolist()
                print(f"   {slot}: {', '.join(courses_in_slot)}")
            
            print(f"\n💡 This schedule maximizes learning value while respecting all constraints!")
        else:
            print("❌ No courses selected - this shouldn't happen!")
    
    else:
        print("❌ No optimal solution found")
        print(f"Solver status: {result.solver.termination_condition}")
        print("This might indicate infeasible constraints or solver issues.")

else:
    print("❌ No solvers available. Please install a solver first.")
    print("Try running: !pip install highspy, then re-run the setup cell")

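With only 15 courses, the schedule can also be verified by enumeration. The sketch below is a plain-Python cross-check, not part of the Pyomo workflow: it copies the catalog data from the cells above, tries every combination of up to one course per time slot, and keeps the combinations that meet the credit, time-conflict, and core-course rules. The area-balance rule is left out of this check, and the helper name `is_valid` is hypothetical.

```python
from itertools import combinations

# (course_id, credits, learning_value, time_slot), copied from the catalog above
catalog = [
    ("IE511", 3, 9, "MWF_9"), ("IE512", 3, 8, "TTh_130"), ("IE513", 3, 7, "MWF_11"),
    ("IE521", 3, 8, "TTh_300"), ("IE522", 3, 7, "MWF_9"), ("IE531", 3, 8, "TTh_130"),
    ("IE532", 3, 6, "MWF_11"), ("IE541", 3, 9, "TTh_300"), ("IE542", 3, 7, "MWF_9"),
    ("IE551", 3, 6, "TTh_130"), ("IE561", 3, 7, "MWF_11"), ("IE571", 3, 5, "TTh_300"),
    ("IE581", 3, 8, "MWF_9"), ("IE591", 3, 9, "TTh_130"), ("IE598", 3, 6, "MWF_11"),
]

def is_valid(schedule):
    """Check credit limits, time conflicts, and the core-course requirement."""
    total_credits = sum(course[1] for course in schedule)
    slots = [course[3] for course in schedule]
    no_conflicts = len(slots) == len(set(slots))
    has_core = any(course[2] >= 8 for course in schedule)
    return 9 <= total_credits <= 12 and no_conflicts and has_core

# No schedule can hold more courses than there are distinct time slots
max_courses = len({course[3] for course in catalog})
valid = [
    schedule
    for size in range(1, max_courses + 1)
    for schedule in combinations(catalog, size)
    if is_valid(schedule)
]

best_value = max(sum(course[2] for course in s) for s in valid)
optimal = [s for s in valid if sum(course[2] for course in s) == best_value]

print(f"Best total learning value: {best_value}")
for schedule in optimal:
    print("  " + ", ".join(course[0] for course in schedule))
```

If more than one schedule is printed, the schedules tie on learning value, and a solver may return any one of them.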
### 🎯 Interactive Exercise: Modify Your Schedule

Now it's your turn! Try modifying the course scheduling model:

1. **Change the credit requirements**: What if you need 15 credits instead of 12?
2. **Add personal preferences**: Give extra learning value to courses you're interested in
3. **Add time preferences**: Maybe you don't want early morning classes?
4. **Change the objective**: Instead of maximizing learning, try minimizing the number of courses

**This is how optimization helps with real decisions!**

---

# Part 4: Solver Performance Comparison 🏎️

One key advantage of Pyomo is solver-agnostic modeling. Let's demonstrate the difference between open-source and commercial solvers using a challenging problem.

## The Campus Tour Problem (TSP)

**Challenge**: Visit key NC State locations in the shortest possible route:
- Hunt Library (your second home)
- Fitts-Woolard Hall (ISE Department)
- Talley Student Union (food!)
- D.H. Hill Library (backup study spot)
- Engineering Building I (classic)
- Free Expression Tunnel (campus tradition)
- Bell Tower (campus landmark)
- Carmichael Gym (fitness goals)

This is a **Traveling Salesman Problem (TSP)** - a classic optimization problem whose number of possible tours grows factorially with the number of stops, so it gets very difficult very quickly!

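To put numbers on how quickly the TSP grows, the sketch below counts the distinct round trips for a few problem sizes. It assumes symmetric distances (true for the Euclidean distances used in this notebook), so a tour and its reverse count as one; the function name `count_distinct_tours` is hypothetical.

```python
import math

def count_distinct_tours(n_stops):
    """Number of distinct round trips through n_stops locations (n_stops >= 3)."""
    # Fix the starting point, order the remaining stops, and count each direction once
    return math.factorial(n_stops - 1) // 2

for n in (5, 8, 10, 15, 20):
    print(f"{n:>2} stops: {count_distinct_tours(n):,} tours")
```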
In [None]:
# NC State Campus Tour TSP Problem
print("🏫 Setting up the NC State Campus Tour Problem...")
print("=" * 50)

# Campus locations with approximate coordinates (relative distances)
locations = {
    'Hunt_Library': (0, 0),
    'Fitts_Woolard': (0.3, 0.5),
    'Talley_Union': (0.5, 0.2),
    'DH_Hill': (0.7, 0.3),
    'Engineering_I': (0.4, 0.8),
    'Free_Tunnel': (0.8, 0.6),
    'Bell_Tower': (0.6, 0.4),
    'Carmichael': (0.9, 0.1)
}

# Calculate distance matrix (Euclidean distance)
import math

def calculate_distance(loc1, loc2):
    return math.sqrt((loc1[0] - loc2[0])**2 + (loc1[1] - loc2[1])**2)

location_names = list(locations.keys())
n_locations = len(location_names)

# Create distance matrix - only for pairs where i != j (matching ARCS set)
distance_matrix = {}
for i, loc1 in enumerate(location_names):
    for j, loc2 in enumerate(location_names):
        if i != j:  # Only include non-diagonal entries
            distance_matrix[(i, j)] = calculate_distance(locations[loc1], locations[loc2])

print(f"🗺️ Campus locations: {', '.join(location_names)}")
print(f"📏 Distance matrix calculated for {n_locations} locations")

# Display location coordinates
print("\n📍 Location Coordinates:")
for name, coords in locations.items():
    print(f"   {name}: ({coords[0]:.1f}, {coords[1]:.1f})")

# Quick visualization
plt.figure(figsize=(10, 8))
for name, coords in locations.items():
    plt.plot(coords[0], coords[1], 'ro', markersize=10)
    plt.annotate(name.replace('_', ' '), coords, xytext=(5, 5), textcoords='offset points', fontsize=10)

plt.xlim(-0.1, 1.1)
plt.ylim(-0.1, 1.0)
plt.xlabel('X Coordinate')
plt.ylabel('Y Coordinate')
plt.title('NC State Campus Tour Locations')
plt.grid(True, alpha=0.3)
plt.show()

print("\n🎯 Challenge: Find the shortest route visiting all locations exactly once!")

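Eight locations are few enough to solve by exhaustive search, which gives a baseline for checking any solver result. The sketch below is a plain-Python illustration, not the Pyomo model: it fixes Hunt Library as the start, tries every ordering of the other seven stops, and keeps the shortest closed tour. The coordinates are copied from the cell above; the helper name `tour_length` is hypothetical. A solver that reports an optimal solution should match the printed length up to numerical tolerance.

```python
import math
from itertools import permutations

# Coordinates copied from the cell above
locations = {
    'Hunt_Library': (0, 0),
    'Fitts_Woolard': (0.3, 0.5),
    'Talley_Union': (0.5, 0.2),
    'DH_Hill': (0.7, 0.3),
    'Engineering_I': (0.4, 0.8),
    'Free_Tunnel': (0.8, 0.6),
    'Bell_Tower': (0.6, 0.4),
    'Carmichael': (0.9, 0.1),
}
names = list(locations)

def tour_length(order):
    """Length of the closed tour that visits `order` and returns to its first stop."""
    return sum(
        math.dist(locations[order[k]], locations[order[(k + 1) % len(order)]])
        for k in range(len(order))
    )

# Fix the start and try all 7! = 5,040 orderings of the remaining stops
start, rest = names[0], names[1:]
best_order = min(((start,) + p for p in permutations(rest)), key=tour_length)

print(" -> ".join(best_order + (start,)))
print(f"Tour length: {tour_length(best_order):.4f}")
```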
In [None]:
# Build TSP Model for Campus Tour
print("🏗️ Building the Campus Tour TSP Model...")
print("=" * 45)

# Create TSP model
tsp_model = pyo.ConcreteModel()

# Sets
tsp_model.LOCATIONS = pyo.Set(initialize=range(n_locations))
tsp_model.ARCS = pyo.Set(initialize=[(i, j) for i in range(n_locations) for j in range(n_locations) if i != j])

# Parameters
tsp_model.distance = pyo.Param(tsp_model.ARCS, initialize=distance_matrix)

# Decision variables
tsp_model.x = pyo.Var(tsp_model.ARCS, within=pyo.Binary)  # 1 if we travel from i to j

# Objective: minimize total distance
tsp_model.total_distance = pyo.Objective(
    expr=sum(tsp_model.distance[i, j] * tsp_model.x[i, j] for (i, j) in tsp_model.ARCS),
    sense=pyo.minimize
)

# Constraints
# 1. Each location must be visited exactly once (outgoing)
tsp_model.leave_once = pyo.ConstraintList()
for i in tsp_model.LOCATIONS:
    tsp_model.leave_once.add(sum(tsp_model.x[i, j] for j in tsp_model.LOCATIONS if j != i) == 1)

# 2. Each location must be visited exactly once (incoming)
tsp_model.enter_once = pyo.ConstraintList()
for j in tsp_model.LOCATIONS:
    tsp_model.enter_once.add(sum(tsp_model.x[i, j] for i in tsp_model.LOCATIONS if i != j) == 1)

# 3. Subtour elimination (Miller-Tucker-Zemlin formulation)
tsp_model.u = pyo.Var(tsp_model.LOCATIONS, within=pyo.NonNegativeReals, bounds=(0, n_locations-1))
tsp_model.subtour_elimination = pyo.ConstraintList()
for i in tsp_model.LOCATIONS:
    for j in tsp_model.LOCATIONS:
        if i != j and i != 0 and j != 0:  # Exclude depot (location 0)
            tsp_model.subtour_elimination.add(
                tsp_model.u[i] - tsp_model.u[j] + n_locations * tsp_model.x[i, j] <= n_locations - 1
            )

print("✅ TSP model built successfully:")
print(f"   - {n_locations} locations")
print(f"   - {len(tsp_model.ARCS)} directed arcs (possible legs between locations)")
print(f"   - Objective: Minimize total tour distance")
print(f"   - Constraints: Visit each location exactly once, no subtours")

print("\n🎯 TSP instances get hard quickly as they grow - let's see how different solvers handle this one!")

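The subtour-elimination constraints above are the least intuitive part of the model, so here is a small plain-Python check of what they do. It is a sketch on a 5-location toy instance with hypothetical names: `successor[i]` is the location visited right after `i`, and `u[i]` is the order value of `i`. For a single tour, using visit positions as `u` satisfies every constraint. For a solution with a subtour that avoids location 0, no choice of `u` works: adding the two constraints around the cycle 3 → 4 → 3 gives 2n ≤ 2(n − 1), which is impossible.

```python
from itertools import product

def mtz_violations(successor, u, n):
    """List the (i, j) pairs whose MTZ constraint is violated.

    Location 0 is the depot and is excluded, as in the model above.
    """
    violated = []
    for i in range(1, n):
        for j in range(1, n):
            if i == j:
                continue
            x_ij = 1 if successor[i] == j else 0
            if u[i] - u[j] + n * x_ij > n - 1:
                violated.append((i, j))
    return violated

n = 5

# One tour, 0 -> 1 -> 2 -> 3 -> 4 -> 0, with u set to each location's visit position
full_tour = {0: 1, 1: 2, 2: 3, 3: 4, 4: 0}
positions = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
print("Violations for the full tour:", mtz_violations(full_tour, positions, n))

# Two subtours, 0 -> 1 -> 2 -> 0 and 3 -> 4 -> 3: try every integer u[3], u[4] in 0..n-1
subtours = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}
always_violated = all(
    mtz_violations(subtours, {0: 0, 1: 1, 2: 2, 3: u3, 4: u4}, n)
    for u3, u4 in product(range(n), repeat=2)
)
print("Subtour solution always violates a constraint:", always_violated)
```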
In [None]:
# Solve TSP with Available Solvers and Compare Performance
print("🏁 Solving Campus Tour TSP with Different Solvers...")
print("=" * 55)

solver_results = {}

# Test each available solver
for solver_name in available_solvers:
    print(f"\n🔧 Testing {solver_name.upper()} solver...")
    
    try:
        solver = pyo.SolverFactory(solver_name)
        
        # Set a 30-second time limit (option names differ by solver; solvers not listed here use their defaults)
        if solver_name == 'gurobi':
            solver.options['TimeLimit'] = 30
        elif solver_name == 'cbc':
            solver.options['seconds'] = 30
        elif solver_name == 'glpk':
            solver.options['tmlim'] = 30
        
        # Solve with timing
        start_time = time.time()
        result = solver.solve(tsp_model, tee=False)
        solve_time = time.time() - start_time
        
        # Store results
        solver_results[solver_name] = {
            'solve_time': solve_time,
            'status': result.solver.termination_condition,
            'objective': None,
            'solution': None
        }
        
        if result.solver.termination_condition == pyo.TerminationCondition.optimal:
            objective_value = pyo.value(tsp_model.total_distance)
            solver_results[solver_name]['objective'] = objective_value
            
            # Extract tour
            tour = []
            current = 0  # Start at Hunt Library
            visited = set([current])
            
            while len(visited) < n_locations:
                for j in tsp_model.LOCATIONS:
                    if j not in visited and pyo.value(tsp_model.x[current, j]) > 0.5:
                        tour.append((current, j))
                        visited.add(j)
                        current = j
                        break
                else:
                    # No unvisited successor found: stop instead of looping forever
                    break
            
            # Close the tour
            tour.append((current, 0))
            solver_results[solver_name]['solution'] = tour
            
            print(f"   ✅ {solver_name.upper()}: Optimal solution found!")
            print(f"      Solve time: {solve_time:.4f} seconds")
            print(f"      Total distance: {objective_value:.4f}")
            
        else:
            print(f"   ⚠️ {solver_name.upper()}: {result.solver.termination_condition}")
            print(f"      Solve time: {solve_time:.4f} seconds")
    
    except Exception as e:
        print(f"   ❌ {solver_name.upper()}: Error - {str(e)}")
        solver_results[solver_name] = {
            'solve_time': None,
            'status': 'Error',
            'objective': None,
            'solution': None
        }

# Summary comparison
print("\n🏆 SOLVER PERFORMANCE COMPARISON")
print("=" * 45)

if solver_results:
    comparison_data = []
    for solver_name, results in solver_results.items():
        comparison_data.append({
            'Solver': solver_name.upper(),
            'Status': str(results['status']),
            'Time (s)': f"{results['solve_time']:.4f}" if results['solve_time'] is not None else "N/A",
            'Objective': f"{results['objective']:.4f}" if results['objective'] is not None else "N/A"
        })
    
    comparison_df = pd.DataFrame(comparison_data)
    print(comparison_df.to_string(index=False))
    
    # Find best solution
    optimal_results = {k: v for k, v in solver_results.items() if v['objective'] is not None}
    if optimal_results:
        best_solver = min(optimal_results.keys(), key=lambda x: optimal_results[x]['objective'])
        best_solution = optimal_results[best_solver]['solution']
        
        print(f"\n🎯 Best Solution: {best_solver.upper()}")
        print(f"   Total Distance: {optimal_results[best_solver]['objective']:.4f}")
        # Tour stops in visiting order: the start, then the end of each arc
        stops = [location_names[best_solution[0][0]]] + [location_names[end] for _, end in best_solution]
        print("   Tour: " + " → ".join(stops))
    
    print("\n💡 Key Insights:")
    print("   - Commercial solvers (Gurobi, CPLEX) typically solve large MILPs faster")
    print("   - Open-source solvers (CBC, GLPK) are free but may be slower on hard instances")
    print("   - For research, a common pattern is open-source for development and commercial for large-scale runs")
    print("   - Academic licenses make commercial solvers accessible for students!")
    
else:
    print("No solver results to compare.")
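
One way to sanity-check a solver's tour on a small instance is exhaustive enumeration. The sketch below is a standalone illustration in plain Python: the four coordinates are made up (corners of a 3 x 4 rectangle), not the campus data, and `tour_length` and `brute_force_tsp` are hypothetical helper names. It only scales to roughly ten locations, since it checks all `(n - 1)!` orderings, which is exactly why MILP formulations and good solvers matter.

```python
import itertools
import math

def tour_length(order, dist):
    """Length of the closed tour that visits `order` and returns to its start."""
    return sum(dist[a][b] for a, b in zip(order, order[1:] + order[:1]))

def brute_force_tsp(dist):
    """Enumerate every tour that starts at node 0 and return the shortest one."""
    n = len(dist)
    best_order, best_len = None, math.inf
    for perm in itertools.permutations(range(1, n)):
        order = [0, *perm]
        length = tour_length(order, dist)
        if length < best_len:
            best_order, best_len = order, length
    return best_order, best_len

# Hypothetical coordinates: corners of a 3 x 4 rectangle
points = [(0, 0), (0, 3), (4, 3), (4, 0)]
dist = [[math.dist(p, q) for q in points] for p in points]

best_order, best_len = brute_force_tsp(dist)
print(best_order, best_len)  # [0, 1, 2, 3] 14.0 (walking the perimeter)
```

If a solver reports a different objective on the same distance matrix, either the model or the solution extraction deserves a second look.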

---

# Part 5: Your Turn - Interactive Practice! 🎮

Now it's time to apply what you've learned! Choose one of these exercises to practice your optimization skills:

## Exercise Options:

### Option A: Personal Schedule Optimization
Create an optimization model for your daily schedule:
- **Variables**: Time spent on research, coursework, exercise, sleep, social activities
- **Objective**: Maximize "happiness" or "productivity"
- **Constraints**: Total time = 24 hours, minimum sleep, minimum research time, etc.

### Option B: Research Resource Allocation
Optimize how to spend your research budget:
- **Variables**: Money spent on conferences, equipment, software, books
- **Objective**: Maximize research impact
- **Constraints**: Total budget, minimum conference attendance, etc.

### Option C: Diet Optimization for PhD Students
Optimize your meal plan for maximum energy and minimum cost:
- **Variables**: Meals from different sources (campus, home cooking, takeout)
- **Objective**: Minimize cost subject to nutrition requirements (or combine nutrition and cost into a single weighted objective)
- **Constraints**: Calorie requirements, time constraints, budget limits

**Pick one and implement it below!**

In [None]:
# Your Practice Exercise - Fill in your optimization model here!
print("🎯 Your Turn: Implement Your Optimization Model!")
print("=" * 50)

# Example template for Personal Schedule Optimization
# Feel free to modify this for your chosen exercise

# Step 1: Create your model
your_model = pyo.ConcreteModel()

# Step 2: Define decision variables
# Example: Time allocation variables
your_model.research_time = pyo.Var(within=pyo.NonNegativeReals)
your_model.coursework_time = pyo.Var(within=pyo.NonNegativeReals)
your_model.exercise_time = pyo.Var(within=pyo.NonNegativeReals)
your_model.sleep_time = pyo.Var(within=pyo.NonNegativeReals)
your_model.social_time = pyo.Var(within=pyo.NonNegativeReals)

print("✅ Decision variables defined")

# Step 3: Define objective function
# Example: Maximize happiness (you define the happiness function!)
your_model.happiness = pyo.Objective(
    expr=(
        3 * your_model.research_time +      # Research gives 3 happiness per hour
        2 * your_model.coursework_time +    # Coursework gives 2 happiness per hour
        4 * your_model.exercise_time +      # Exercise gives 4 happiness per hour
        2 * your_model.sleep_time +         # Sleep gives 2 happiness per hour
        5 * your_model.social_time          # Social time gives 5 happiness per hour
    ),
    sense=pyo.maximize
)

print("✅ Objective function defined")

# Step 4: Add constraints
# Time constraint: Total time = 24 hours
your_model.time_constraint = pyo.Constraint(
    expr=your_model.research_time + your_model.coursework_time + your_model.exercise_time + 
         your_model.sleep_time + your_model.social_time == 24
)

# Minimum sleep constraint
your_model.min_sleep = pyo.Constraint(expr=your_model.sleep_time >= 6)

# Minimum research constraint (you're a PhD student!)
your_model.min_research = pyo.Constraint(expr=your_model.research_time >= 4)

# Add your own constraints here!
# Example: Maximum social time
your_model.max_social = pyo.Constraint(expr=your_model.social_time <= 6)

# Minimum exercise (stay healthy!)
your_model.min_exercise = pyo.Constraint(expr=your_model.exercise_time >= 1)

print("✅ Constraints added")

# Step 5: Solve your model
if available_solvers:
    solver = pyo.SolverFactory(available_solvers[0])
    result = solver.solve(your_model)
    
    if result.solver.termination_condition == pyo.TerminationCondition.optimal:
        print("\n🎉 YOUR OPTIMAL SOLUTION:")
        print("=" * 30)
        print(f"Research time: {pyo.value(your_model.research_time):.2f} hours")
        print(f"Coursework time: {pyo.value(your_model.coursework_time):.2f} hours")
        print(f"Exercise time: {pyo.value(your_model.exercise_time):.2f} hours")
        print(f"Sleep time: {pyo.value(your_model.sleep_time):.2f} hours")
        print(f"Social time: {pyo.value(your_model.social_time):.2f} hours")
        print(f"\nTotal happiness: {pyo.value(your_model.happiness):.2f}")
        
        # Verification
        total_time = (pyo.value(your_model.research_time) + 
                     pyo.value(your_model.coursework_time) + 
                     pyo.value(your_model.exercise_time) + 
                     pyo.value(your_model.sleep_time) + 
                     pyo.value(your_model.social_time))
        print(f"\nVerification: Total time = {total_time:.2f} hours")
        
        print("\n💡 Questions to consider:")
        print("   - Does this schedule seem realistic?")
        print("   - What if you change the happiness weights?")
        print("   - What additional constraints would make this more realistic?")
        
    else:
        print(f"❌ No optimal solution found (status: {result.solver.termination_condition})")
        print("Check your constraints - they might be infeasible!")
else:
    print("❌ No solvers available")

print("\n🎓 Great job! You've built your own optimization model!")

### 🤔 Reflection Questions:

1. **Model Realism**: How realistic is your optimal solution? What constraints would you add to make it more realistic?

2. **Sensitivity Analysis**: What happens if you change the objective function weights? Try doubling the weight on exercise or sleep.

3. **Trade-offs**: What trade-offs does your model reveal? What activities compete for your time?

4. **Extensions**: How could you extend this model? Consider uncertainty, multiple objectives, or dynamic scheduling.

**This is the power of optimization - it forces you to think clearly about your objectives and constraints!**
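
Reflection question 2 can be explored even without a solver. The schedule template has one equality constraint (hours sum to 24) plus simple lower and upper bounds, and for that special structure a greedy rule is optimal: meet every minimum, then give the leftover hours to the highest-weight activities first. The sketch below assumes the weights and bounds from the template; the function name `solve_schedule` and the upper bound of 24 for activities without an explicit cap are illustrative choices, and the approach does not generalize to models with other kinds of constraints.

```python
def solve_schedule(weights, lower, upper, total=24.0):
    """Greedy optimum of: max sum(w[a] * x[a])
    s.t. sum(x) == total and lower[a] <= x[a] <= upper[a]."""
    x = dict(lower)  # start every activity at its minimum
    remaining = total - sum(x.values())
    if remaining < 0:
        raise ValueError("minimum requirements exceed the available time")
    # Hand out leftover hours to the highest-weight activities first
    for a in sorted(weights, key=weights.get, reverse=True):
        extra = min(upper[a] - x[a], remaining)
        x[a] += extra
        remaining -= extra
    if remaining > 1e-9:
        raise ValueError("upper bounds leave hours unassigned")
    return x, sum(weights[a] * x[a] for a in weights)

# Weights and bounds copied from the template model
weights = {"research": 3, "coursework": 2, "exercise": 4, "sleep": 2, "social": 5}
lower = {"research": 4, "coursework": 0, "exercise": 1, "sleep": 6, "social": 0}
upper = {"research": 24, "coursework": 24, "exercise": 24, "sleep": 24, "social": 6}

base_plan, base_value = solve_schedule(weights, lower, upper)

# Reflection question 2: double the weight on exercise and re-solve
alt_weights = {**weights, "exercise": 2 * weights["exercise"]}
alt_plan, alt_value = solve_schedule(alt_weights, lower, upper)

for label, plan, value in [("base", base_plan, base_value),
                           ("exercise x2", alt_plan, alt_value)]:
    print(label, plan, value)
```

With the template weights this gives 8 hours of exercise, 6 of social time, and no coursework (happiness 86), the same allocation an LP solver should report. Doubling the exercise weight moves all 13 spare hours to exercise (14 hours, happiness 136), a good prompt for reflection question 1 on realism.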

---

# Block 4 Wrap-Up: From Data to Decisions 🎯

## What You've Accomplished Today

Congratulations! In this 50-minute block, you've:

✅ **Mastered Optimization Fundamentals**
- Understood the three components: variables, objective, constraints
- Visualized linear programming graphically
- Learned the difference between LP and MILP

✅ **Learned Pyomo Modeling**
- Built optimization models in Python
- Used solver-agnostic modeling for flexibility
- Solved real-world decision problems

✅ **Applied OR to Real Problems**
- Optimized PhD student schedules and budgets
- Solved course scheduling with multiple constraints
- Tackled the classic Traveling Salesman Problem

✅ **Compared Solver Performance**
- Understood open-source vs commercial solver trade-offs
- Experienced the power of different optimization engines
- Learned about academic licensing opportunities

## The Optimization Mindset

Remember the key insight from today:

**Optimization is not just about finding the best solution - it's about understanding the structure of your decision problem.**

When you formulate an optimization model, you're forced to:
- **Clarify your objectives**: What do you really want to optimize?
- **Identify constraints**: What limitations must you respect?
- **Make trade-offs explicit**: When resources are limited, what matters most?

## Applications in PhD Research

As ISE/OR PhD students, you'll use optimization for:

- **Research Design**: Optimize experimental designs, sample sizes, resource allocation
- **Algorithm Development**: Create new optimization algorithms and heuristics
- **Real-World Applications**: Solve industry problems in manufacturing, logistics, healthcare
- **Policy Analysis**: Optimize resource allocation for social and economic systems

## Next Steps

To continue your optimization journey:

1. **Practice More**: Try the exercises with different parameters and constraints
2. **Explore Advanced Topics**: Stochastic programming, robust optimization, nonlinear programming
3. **Use Real Data**: Apply these techniques to your research domain
4. **Join the Community**: Participate in optimization conferences and workshops
5. **Consider a Gurobi Academic License**: Free for academic research and coursework

## Looking Ahead

In the remaining blocks, we'll explore:
- **Block 5**: Version control and reproducible research workflows
- **Block 6**: AI tools for research productivity

## Resources for Further Learning

- **Pyomo Documentation**: https://pyomo.readthedocs.io/
- **Gurobi Academic Program**: https://www.gurobi.com/academia/
- **OR-Tools by Google**: https://developers.google.com/optimization
- **INFORMS (Institute for Operations Research and the Management Sciences)**: https://www.informs.org/

---

## Thank You! 🙏

Great work in Block 4! You've taken a significant step in your journey from data to decisions. In the next block, we'll explore the tools that will help you manage and share your optimization models effectively.

**Remember: The best optimization model is the one that helps you make better decisions!**

---

*Questions? Feel free to ask during the break or reach out via email: wkkirsch@ncsu.edu*