In [None]:
import marimo as mo

# Week 1: Introduction to Machine Learning

**IME775: Data Driven Modeling and Optimization**

📖 **Reference**: Watt, Borhani, & Katsaggelos (2020). *Machine Learning Refined* (2nd ed.), **Chapter 1**

---

## Learning Objectives

By the end of this session, you will be able to:

- Define machine learning and its role in data-driven decision making
- Understand the basic taxonomy of ML problems (supervised, unsupervised)
- Connect machine learning to mathematical optimization
- Identify regression vs classification problems

In [None]:
import numpy as np
import matplotlib.pyplot as plt

## What is Machine Learning?

> "Machine learning is a term used to describe a broad collection of pattern-finding
> algorithms designed to properly identify system rules empirically by leveraging
> enormous amounts of data and computing power."
>
> -- *ML Refined, Ch. 1*

### The Core Idea

Given **data** about a system, find a **rule** (model) that:

- Accurately describes the relationship between inputs and outputs
- Generalizes to new, unseen data
- Enables prediction and decision-making

## Example: Distinguishing Cats from Dogs (Section 1.2)

The classic ML problem from Chapter 1:

1. **Collect data**: Images labeled as "cat" or "dog"
2. **Extract features**: Pixel values, edge patterns, etc.
3. **Learn a rule**: Find decision boundary separating cats from dogs
4. **Make predictions**: Classify new images

This is a **supervised learning** problem with **binary classification**.
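
The four steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the 2-D points are synthetic stand-ins for image features, and the "rule" is a nearest-centroid classifier chosen for brevity, not the method used in the textbook. The names `cats`, `dogs`, `centroids`, and `classify` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1-2: labeled data, already reduced to two synthetic features per example
cats = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(50, 2))    # class 0
dogs = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(50, 2))  # class 1

# Step 3: "learn a rule" by storing the mean feature vector of each class
centroids = np.stack([cats.mean(axis=0), dogs.mean(axis=0)])


def classify(points):
    """Return 0 (cat) or 1 (dog) per row: the index of the nearest centroid."""
    points = np.atleast_2d(points)
    # Pairwise distances between every point and every centroid, shape (n, 2)
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


# Step 4: predict labels for new, unseen points
new_points = np.array([[3.0, 1.5], [-1.0, -2.5]])
predictions = classify(new_points)
print(predictions)
```

The first new point lies near the class 0 cluster and the second near the class 1 cluster, so they are assigned labels 0 and 1 respectively. For two classes, the implied decision boundary is the perpendicular bisector of the segment joining the two centroids.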

In [None]:
# Visualize a simple classification problem
np.random.seed(42)

# Generate two classes of data
n = 50
class_0 = np.random.randn(n, 2) + np.array([2, 2])
class_1 = np.random.randn(n, 2) + np.array([-2, -2])

fig, ax = plt.subplots(figsize=(10, 8))
ax.scatter(class_0[:, 0], class_0[:, 1], c='blue', s=80,
           label='Class 0 (e.g., Cats)', alpha=0.7, edgecolors='black')
ax.scatter(class_1[:, 0], class_1[:, 1], c='red', s=80,
           label='Class 1 (e.g., Dogs)', alpha=0.7, edgecolors='black')

# Decision boundary: the classes are centered at (2, 2) and (-2, -2),
# so the separating line is y = -x, the perpendicular bisector of the two centers
x_line = np.linspace(-5, 5, 100)
ax.plot(x_line, -x_line, 'k--', linewidth=2, label='Decision Boundary')

ax.set_xlabel('Feature 1', fontsize=12)
ax.set_ylabel('Feature 2', fontsize=12)
ax.set_title('Binary Classification Problem (ML Refined, Section 1.2)', fontsize=14)
ax.legend()
ax.grid(True, alpha=0.3)
ax.set_aspect('equal')
fig

## The Basic Taxonomy of ML Problems (Section 1.3)

### Supervised Learning

**Given**: Input-output pairs $(x_i, y_i)$

**Goal**: Learn mapping $f: x \rightarrow y$

| Problem Type | Output $y$ | Examples |
|--------------|------------|----------|
| **Regression** | Continuous | House prices, temperature |
| **Classification** | Discrete | Spam/not spam, disease diagnosis |

### Unsupervised Learning

**Given**: Only inputs $x_i$ (no labels)

**Goal**: Find structure or patterns

| Problem Type | Goal | Examples |
|--------------|------|----------|
| **Clustering** | Group similar data | Customer segmentation |
| **Dimensionality Reduction** | Compress data | PCA, autoencoders |
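
To make the unsupervised row concrete, here is a minimal clustering sketch. It assumes k-means (Lloyd's algorithm) as the example method, which the text above does not name, and it runs on synthetic, unlabeled 2-D data. The function `kmeans` and its deterministic initialization are illustrative choices, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unlabeled inputs: two blobs, but no y values are passed to the algorithm
X = np.vstack([
    rng.normal(loc=[2.0, 2.0], scale=0.5, size=(40, 2)),
    rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(40, 2)),
])


def kmeans(X, k, n_iter=20):
    """Plain k-means: alternate between assigning points and updating centroids."""
    # Deterministic initialization (an assumption of this sketch):
    # take k evenly spaced rows of X as the starting centroids
    idx = np.linspace(0, len(X) - 1, k).astype(int)
    centroids = X[idx].copy()
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids


labels, centroids = kmeans(X, k=2)
print(np.bincount(labels))  # number of points assigned to each cluster
```

The cluster indices themselves carry no meaning; only the grouping does. That is the practical difference from classification, where each label corresponds to a known class.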

In [None]:
# Regression vs Classification visualization
fig2, axes = plt.subplots(1, 2, figsize=(14, 5))
np.random.seed(42)

# Regression example
ax1 = axes[0]
x_reg = np.linspace(0, 10, 50)
y_reg = 2 * x_reg + 3 + np.random.randn(50) * 2
ax1.scatter(x_reg, y_reg, alpha=0.7, s=50)
ax1.plot(x_reg, 2 * x_reg + 3, 'r-', linewidth=2, label='Learned rule')
ax1.set_xlabel('Input x')
ax1.set_ylabel('Output y (continuous)')
ax1.set_title('Regression: Continuous Output')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Classification example
ax2 = axes[1]
x1 = np.random.randn(30) + 2
y1 = np.random.randn(30) + 2
x2 = np.random.randn(30) - 2
y2 = np.random.randn(30) - 2
ax2.scatter(x1, y1, c='blue', s=50, label='Class 0', alpha=0.7)
ax2.scatter(x2, y2, c='red', s=50, label='Class 1', alpha=0.7)
ax2.set_xlabel('Feature 1')
ax2.set_ylabel('Feature 2')
ax2.set_title('Classification: Discrete Output')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
fig2

## Mathematical Optimization in ML (Section 1.4)

Machine learning is fundamentally an **optimization problem**:

$$\min_{\theta} \text{Cost}(\theta; \text{data})$$

Where:

- $\theta$: Model parameters (weights, coefficients)
- Cost: Measures how well the model fits the data
- data: Training examples

### The Learning Problem

1. **Choose a model**: Linear, polynomial, neural network, etc.
2. **Define a cost function**: MSE, cross-entropy, etc.
3. **Optimize**: Find $\theta^*$ that minimizes the cost
4. **Predict**: Use learned $\theta^*$ on new data
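
The four steps of the learning problem can be walked through on a small example. The sketch below assumes a linear model, an MSE cost, and plain gradient descent with a hand-picked learning rate. These are convenient illustrative choices, not the method prescribed by the text, and the names `predict`, `cost`, and `gradient` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 30)
y = 2 * x + 3 + rng.normal(scale=2.0, size=x.shape)  # synthetic training data


# 1. Choose a model: y_hat = w * x + b, with theta = (w, b)
def predict(theta, x):
    w, b = theta
    return w * x + b


# 2. Define a cost function: mean squared error
def cost(theta, x, y):
    return np.mean((y - predict(theta, x)) ** 2)


# Gradient of the MSE with respect to (w, b)
def gradient(theta, x, y):
    residual = predict(theta, x) - y
    return np.array([2 * np.mean(residual * x), 2 * np.mean(residual)])


# 3. Optimize: repeatedly step against the gradient
theta = np.zeros(2)
learning_rate = 0.01  # small enough for this data; an assumption, not a general rule
for _ in range(5000):
    theta = theta - learning_rate * gradient(theta, x, y)

# 4. Predict: apply the learned parameters to a new input
y_new = predict(theta, np.array([12.0]))
print(f"theta* = {theta}, cost = {cost(theta, x, y):.3f}, prediction at x=12: {y_new[0]:.2f}")
```

For this convex cost, the result should agree closely with the closed-form least-squares fit, for example the one returned by `np.polyfit(x, y, 1)`.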

## Interactive: Explore Linear Regression

Adjust the slope and intercept sliders to see how the line fits the data and how the cost (MSE) changes.

In [None]:
slope_slider = mo.ui.slider(-5, 5, value=2, step=0.1, label="Slope")
intercept_slider = mo.ui.slider(-10, 10, value=3, step=0.5, label="Intercept")
mo.vstack([slope_slider, intercept_slider])

In [None]:
# Interactive regression
np.random.seed(42)
x_data = np.linspace(0, 10, 30)
y_true = 2 * x_data + 3
y_data = y_true + np.random.randn(30) * 2

slope = slope_slider.value
intercept = intercept_slider.value
y_pred = slope * x_data + intercept

# Calculate MSE
mse = np.mean((y_data - y_pred)**2)

fig3, ax3 = plt.subplots(figsize=(10, 6))
ax3.scatter(x_data, y_data, alpha=0.7, s=50, label='Data')
ax3.plot(x_data, y_pred, 'r-', linewidth=2,
         label=f'Model: y = {slope:.1f}x + {intercept:.1f}')
ax3.plot(x_data, y_true, 'g--', linewidth=1, alpha=0.5, label='True: y = 2x + 3')
ax3.set_xlabel('x')
ax3.set_ylabel('y')
ax3.set_title(f'Cost (MSE) = {mse:.2f}')
ax3.legend()
ax3.grid(True, alpha=0.3)
fig3
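
As a point of comparison for manual tuning, the sketch below evaluates the MSE at every slider setting and reports the best one. It is a brute-force illustration, assuming the same synthetic data and the same slider ranges and step sizes as the cells above; the names `slopes`, `intercepts`, and `mse_grid` are hypothetical.

```python
import numpy as np

# Same synthetic data as the interactive cell (seed 42, y = 2x + 3 plus noise)
np.random.seed(42)
x_data = np.linspace(0, 10, 30)
y_data = 2 * x_data + 3 + np.random.randn(30) * 2

# Candidate values matching the slider ranges and step sizes
slopes = np.linspace(-5, 5, 101)       # step 0.1
intercepts = np.linspace(-10, 10, 41)  # step 0.5

# Predictions for every (slope, intercept) pair via broadcasting: shape (101, 41, 30)
preds = slopes[:, None, None] * x_data[None, None, :] + intercepts[None, :, None]

# MSE for every pair, averaged over the data axis: shape (101, 41)
mse_grid = np.mean((y_data[None, None, :] - preds) ** 2, axis=2)

# Locate the grid cell with the lowest cost
i, j = np.unravel_index(mse_grid.argmin(), mse_grid.shape)
best_slope, best_intercept = slopes[i], intercepts[j]
print(f"best slope = {best_slope:.1f}, best intercept = {best_intercept:.1f}, "
      f"MSE = {mse_grid[i, j]:.2f}")
```

Exhaustive search is feasible here only because there are two parameters and a coarse grid. The number of evaluations grows exponentially with the number of parameters.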

## Summary

| Concept | Key Points |
|---------|------------|
| **Machine Learning** | Finding rules/patterns from data |
| **Supervised Learning** | Learn from labeled input-output pairs |
| **Unsupervised Learning** | Find structure in unlabeled data |
| **Regression** | Predict continuous outputs |
| **Classification** | Predict discrete classes |
| **Optimization** | ML = minimizing a cost function |

---

## References

- **Primary**: Watt, J., Borhani, R., & Katsaggelos, A. K. (2020). *Machine Learning Refined* (2nd ed.), Chapter 1.
- **Supplementary**: James, G. et al. (2023). *An Introduction to Statistical Learning*, Chapter 1.

## Next Week

**Zero-Order Optimization Techniques** (Chapter 2): Global and local optimization methods without derivatives.