<a href="https://colab.research.google.com/github/sreent/machine-learning/blob/main/Linear%20Regression/Linear%20Regression%20Code%20Walk%20Through.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Linear Regression: Code Walk Through

This notebook walks through the **computational steps** of the Linear Regression algorithm from scratch, based on lecture slides 28-32.

## What We'll Cover:
1. **Visualize the data** - understand the dataset
2. **Add bias term** - transform the data to include an intercept
3. **Find best fit line** - compute optimal weights using the closed-form solution
4. **Make predictions** - use learned weights to predict new values

The main walkthrough uses **vectorized NumPy operations**, which map directly onto the matrix math in the slides.

### Key Concept:
- Linear regression finds the **best fit line** through the data
- It uses a **closed-form solution** (no iterative training needed!)
- Formula: **y = w₁x + w₀**, where w₀ is the intercept and w₁ is the slope

## Step 1: Import Libraries

We need:
- **NumPy** for numerical operations and matrix calculations
- **Matplotlib** for visualization

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)

## Step 2: Create Training Data

We create training data similar to the lecture slides (slide 28):
- Generate x values from a range
- Create y values with a linear relationship plus some noise
- This simulates real-world data where measurements have some randomness

In [None]:
# Create training data (similar to slide 28)
# Generate x values
X_train = np.arange(-9.5, 8.5, 0.1).reshape(-1, 1)

# Generate y values: y = 1.035x + 1.069 plus Gaussian noise (mean 0, std 2)
true_slope = 1.035
true_intercept = 1.069
noise = np.random.normal(0, 2, X_train.shape[0])
y_train = true_slope * X_train.ravel() + true_intercept + noise

print("Training data shape:", X_train.shape)
print("Target values shape:", y_train.shape)
print(f"\nNumber of training points: {len(X_train)}")
print(f"X range: [{X_train.min():.1f}, {X_train.max():.1f}]")
print(f"y range: [{y_train.min():.1f}, {y_train.max():.1f}]")
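
A side note on the `np.arange` call above: with a non-integer step such as 0.1, floating-point rounding can make the number of generated points unstable, and NumPy's documentation suggests `np.linspace` for that case. A minimal sketch, assuming the intent is 180 evenly spaced points from -9.5 to 8.4 (the count, the endpoint, and the name `x_grid` are our reading of the `arange` call, not something the slides state):

```python
import numpy as np

# Assumed intent: 180 points spaced 0.1 apart, from -9.5 up to 8.4 inclusive.
n_points = 180
x_grid = np.linspace(-9.5, 8.4, n_points).reshape(-1, 1)

print("Shape:", x_grid.shape)
print("First and last x:", x_grid[0, 0], x_grid[-1, 0])
print("Step is ~0.1 everywhere:", np.allclose(np.diff(x_grid.ravel()), 0.1))
```

With `np.linspace` the number of points is stated explicitly, so it cannot drift with rounding.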

## Step 3: Visualize the Data

Let's plot our training data to see the relationship between x and y.

We can see the points roughly follow a **linear trend** - perfect for linear regression!

In [None]:
# Scatter plot of training data
plt.figure(figsize=(10, 6))
plt.scatter(X_train, y_train,
           c='steelblue', s=30, alpha=0.6,
           edgecolors='black', linewidths=0.5,
           label='Training data')
plt.xlabel('x', fontsize=14)
plt.ylabel('y', fontsize=14)
plt.title('Training Data: Looking for Linear Relationship', fontsize=16)
plt.legend(fontsize=12)
plt.grid(True, alpha=0.3)
plt.show()

print(f"We have {len(X_train)} training points")
print("Goal: Find the line y = w₁x + w₀ that best fits this data")

## Step 4: Add Column of Ones (Bias Term)

**Slide 29** shows the first step: adding a column of 1s to include the intercept term.

**Why?**
- Our model is: **y = w₁x + w₀**
- We can rewrite this as: **y = w₀(1) + w₁x**
- In matrix form: **y = [1, x] × [w₀, w₁]ᵀ**

**Transformation:**
- Original: **[x⁽¹⁾, x⁽²⁾, ..., x⁽ᴺ⁾]**
- With bias: **[[1, x⁽¹⁾], [1, x⁽²⁾], ..., [1, x⁽ᴺ⁾]]**

This matrix is called **Φ** (Phi) or the **design matrix**.

In [None]:
# Add column of ones using np.c_[]
Phi = np.c_[np.ones(len(X_train)), X_train]

print("Original X_train shape:", X_train.shape)
print("Design matrix Φ shape:", Phi.shape)
print("\nFirst few rows of Φ:")
print(Phi[:5])
print("\nEach row is now: [1, x]")
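
To make the logic behind `np.c_[]` explicit, the design matrix can also be built one row at a time. A minimal sketch on a tiny toy array (the values and the names `x_toy`, `Phi_loop`, and `Phi_vec` are illustrative assumptions, not from the slides):

```python
import numpy as np

# Toy inputs, shaped (N, 1) like X_train
x_toy = np.array([[-1.0], [0.0], [2.5]])

# Loop version: prepend a 1 to every x value, one row at a time
rows = []
for x_i in x_toy.ravel():
    rows.append([1.0, x_i])
Phi_loop = np.array(rows)

# Vectorized version: stack a column of ones next to x in one call
Phi_vec = np.c_[np.ones(len(x_toy)), x_toy]

print(Phi_loop)
print("Same result:", np.array_equal(Phi_loop, Phi_vec))
```

Both versions produce the same (N, 2) matrix; the vectorized call avoids the Python-level loop.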

## Step 5: Find Best Fit Line (Closed-Form Solution)

**Slide 30** shows how we compute the optimal weights using the **normal equation**:

$$\mathbf{w} = (\Phi^T \Phi)^{-1} \Phi^T \mathbf{y}$$

Where:
- **Φ** is our design matrix (with bias column)
- **y** is our target values
- **w** = [w₀, w₁] are the weights (intercept and slope)

This gives us the **exact least-squares solution** in one computation (no iterative training!), provided ΦᵀΦ is invertible.
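
For reference, the normal equation follows from minimizing the sum of squared errors. A brief derivation sketch in the notation above (the loss symbol $J$ is introduced here for illustration, and the slides may present this step differently):

$$
J(\mathbf{w}) = \lVert \Phi \mathbf{w} - \mathbf{y} \rVert^2, \qquad
\nabla_{\mathbf{w}} J = 2\,\Phi^T (\Phi \mathbf{w} - \mathbf{y}) = 0
\;\Longrightarrow\; \Phi^T \Phi\, \mathbf{w} = \Phi^T \mathbf{y}
\;\Longrightarrow\; \mathbf{w} = (\Phi^T \Phi)^{-1} \Phi^T \mathbf{y}
$$

The last step requires $\Phi^T \Phi$ to be invertible, which holds for a design matrix of the form $[1, x]$ as long as the x values are not all identical.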

In [None]:
# Compute weights using the closed-form solution
# w = (Φᵀ Φ)⁻¹ Φᵀ y
weights = np.linalg.inv(Phi.T @ Phi) @ Phi.T @ y_train

print("Weights computed using normal equation:")
print(f"\nWeights = {weights}")
print()
print(f"Intercept (w₀): {weights[0]:.3f}")
print(f"Slope (w₁):     {weights[1]:.3f}")
print()
print(f"Final linear model: y = {weights[1]:.3f}x + {weights[0]:.3f}")
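
To see what the matrix expression is doing, the same weights can be computed with an explicit loop. With a single feature, ΦᵀΦ and Φᵀy contain only a handful of sums, and the resulting 2×2 system can be solved by hand. A minimal sketch on its own small synthetic dataset (the seed, sample size, and variable names are illustrative assumptions, not from the slides):

```python
import numpy as np

# Small synthetic dataset with the same slope and intercept as Step 2
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 50)
y = 1.035 * x + 1.069 + rng.normal(0.0, 2.0, x.size)

# Loop version: accumulate the entries of ΦᵀΦ = [[n, Σx], [Σx, Σx²]]
# and Φᵀy = [Σy, Σxy]
n, sum_x, sum_xx, sum_y, sum_xy = 0, 0.0, 0.0, 0.0, 0.0
for x_i, y_i in zip(x, y):
    n += 1
    sum_x += x_i
    sum_xx += x_i * x_i
    sum_y += y_i
    sum_xy += x_i * y_i

# Solve the 2x2 system by hand using the inverse of a 2x2 matrix
det = n * sum_xx - sum_x ** 2
w0 = (sum_xx * sum_y - sum_x * sum_xy) / det   # intercept
w1 = (n * sum_xy - sum_x * sum_y) / det        # slope

# Vectorized version: the normal equation as written above
Phi = np.c_[np.ones(x.size), x]
w_vec = np.linalg.inv(Phi.T @ Phi) @ Phi.T @ y

print(f"Loop:       w0 = {w0:.3f}, w1 = {w1:.3f}")
print(f"Vectorized: w0 = {w_vec[0]:.3f}, w1 = {w_vec[1]:.3f}")
```

The two results agree up to floating-point rounding; the vectorized form simply lets NumPy compute the sums and the inverse.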

## Step 6: Visualize the Best Fit Line

Let's plot our learned line along with the training data to see how well it fits!

In [None]:
# Create x values for plotting the line
x_line = np.linspace(-10, 10, 200)

# Compute y values using our learned weights: y = w₁x + w₀
y_line = weights[1] * x_line + weights[0]

# Plot
plt.figure(figsize=(10, 6))

# Training data
plt.scatter(X_train, y_train,
           c='steelblue', s=30, alpha=0.6,
           edgecolors='black', linewidths=0.5,
           label='Training data', zorder=2)

# Best fit line
plt.plot(x_line, y_line,
        'r-', linewidth=3, alpha=0.8,
        label=f'y = {weights[1]:.3f}x + {weights[0]:.3f}',
        zorder=3)

plt.xlabel('x', fontsize=14)
plt.ylabel('y', fontsize=14)
plt.title('Linear Regression: Best Fit Line', fontsize=16)
plt.legend(fontsize=12)
plt.grid(True, alpha=0.3)
plt.show()

print(f"Model equation: y = {weights[1]:.3f}x + {weights[0]:.3f}")
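
Beyond eyeballing the plot, the fit can be checked numerically. The sketch below computes the mean squared error and verifies a property implied by the normal equation: the residuals are orthogonal to every column of Φ, so Φᵀ(y − Φw) is approximately zero. It uses its own small synthetic dataset (the seed, sample size, and names are illustrative assumptions):

```python
import numpy as np

# Small synthetic dataset with the same slope and intercept as Step 2
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 50)
y = 1.035 * x + 1.069 + rng.normal(0.0, 2.0, x.size)

# Fit with the normal equation
Phi = np.c_[np.ones(x.size), x]
w = np.linalg.inv(Phi.T @ Phi) @ Phi.T @ y

# Residuals: the vertical gaps between the data and the fitted line
residuals = y - Phi @ w
mse = np.mean(residuals ** 2)

print(f"Mean squared error: {mse:.3f}")
print("Residuals orthogonal to Φ columns:",
      np.allclose(Phi.T @ residuals, 0.0, atol=1e-8))
```

Because the noise here has standard deviation 2, an MSE in the neighborhood of 4 is what one would expect from a good fit on data like this.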

## Step 7: Add Bias Term to Test Data

**Slide 31** shows how we prepare the test point for prediction.

Just like with training data, we need to add a column of 1s to our test data.

**Transformation:** [5] → [1, 5]

This ensures our test point has the same format as the training data.

In [None]:
# Test point (from slide 28)
X_test = np.array([[5.0]])

print("Original test point:", X_test[0])
print("Shape:", X_test.shape)
print()

# Add bias term
X_test_with_bias = np.c_[np.ones(len(X_test)), X_test]

print("Test point with bias:", X_test_with_bias[0])
print("Shape:", X_test_with_bias.shape)
print()
print("Now it has the form [1, x] to match our weights [w₀, w₁]")

## Step 8: Make Prediction Using Matrix Multiplication

**Slide 32** shows the prediction calculation:

$$\hat{y} = [1, x] \times [w_0, w_1]^T = 1 \times w_0 + x \times w_1$$

For our test point x = 5:
- ŷ = [1, 5] × [w₀, w₁]ᵀ
- ŷ = 1×w₀ + 5×w₁

In [None]:
# Prediction using matrix multiplication
prediction = X_test_with_bias @ weights

print("Matrix multiplication approach:")
print("="*50)
print(f"Test point: x = {X_test[0,0]}")
print(f"Test point with bias: {X_test_with_bias[0]}")
print(f"Weights: {weights}")
print()
print(f"ŷ = [1, {X_test[0,0]}] × [{weights[0]:.3f}, {weights[1]:.3f}]")
print(f"ŷ = 1×{weights[0]:.3f} + {X_test[0,0]}×{weights[1]:.3f}")
print(f"ŷ = {weights[0]:.3f} + {X_test[0,0] * weights[1]:.3f}")
print(f"ŷ = {prediction[0]:.3f}")
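
The same calculation extends to several test points at once. The sketch below compares a per-point loop with the vectorized matrix product. As stand-in weights it uses the generating intercept and slope from Step 2 (the fitted weights will differ slightly because of the noise), and the batch of test points is hypothetical:

```python
import numpy as np

# Stand-in weights [w0, w1]: the generating intercept and slope from Step 2
w = np.array([1.069, 1.035])

# Hypothetical batch of test points (x = 5 is the one used above)
X_new = np.array([[0.0], [5.0], [-3.0]])

# Loop version: apply y_hat = w0 + w1 * x to each point in turn
preds_loop = []
for x_i in X_new.ravel():
    preds_loop.append(w[0] + w[1] * x_i)
preds_loop = np.array(preds_loop)

# Vectorized version: add the bias column, then one matrix-vector product
Phi_new = np.c_[np.ones(len(X_new)), X_new]
preds_vec = Phi_new @ w

print("Loop:      ", np.round(preds_loop, 3))
print("Vectorized:", np.round(preds_vec, 3))
```

For x = 5 with these stand-in weights, the arithmetic is 1.069 + 5 × 1.035 = 6.244.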

## Step 9: Visualize the Prediction

Let's visualize our prediction on the fitted line, matching slide 28.

In [None]:
# Create x values for plotting the line
x_line = np.linspace(-10, 10, 200)
y_line = weights[1] * x_line + weights[0]

# Plot
plt.figure(figsize=(10, 6))

# Training data
plt.scatter(X_train, y_train,
           c='steelblue', s=30, alpha=0.6,
           edgecolors='black', linewidths=0.5,
           label='Training data', zorder=2)

# Best fit line
plt.plot(x_line, y_line,
        'r-', linewidth=3, alpha=0.8,
        label=f'y = {weights[1]:.3f}x + {weights[0]:.3f}',
        zorder=3)

# Test point and prediction
plt.scatter(X_test, prediction,
           c='red', s=400, marker='*',
           edgecolors='black', linewidths=2,
           label=f'Test point: x={X_test[0,0]:.1f}, ŷ={prediction[0]:.3f}',
           zorder=4)

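# Draw dashed guide lines to the axes (similar to slide 28).
# Read the current limits first, then restore them so the guides touch the axes.
x_min, y_min = plt.xlim()[0], plt.ylim()[0]
plt.plot([X_test[0,0], X_test[0,0]], [y_min, prediction[0]],
        'k--', alpha=0.5, linewidth=1.5)
plt.plot([x_min, X_test[0,0]], [prediction[0], prediction[0]],
        'k--', alpha=0.5, linewidth=1.5)
plt.xlim(left=x_min)
plt.ylim(bottom=y_min)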

plt.xlabel('x', fontsize=14)
plt.ylabel('y', fontsize=14)
plt.title('Linear Regression: Making a Prediction', fontsize=16)
plt.legend(fontsize=11, loc='upper left')
plt.grid(True, alpha=0.3)
plt.show()

print(f"\nFor test point x = {X_test[0,0]:.1f}:")
print(f"Predicted value ŷ = {prediction[0]:.3f}")

## Summary

We've walked through all the computational steps of Linear Regression (slides 28-32):

1. ✅ **Visualized data** - saw training points showing a linear trend
2. ✅ **Added bias term** (Slide 29) - transformed the data by adding a column of 1s to create the design matrix Φ
3. ✅ **Found best fit line** (Slide 30) - used the closed-form solution **w = (ΦᵀΦ)⁻¹Φᵀy** to compute optimal weights
4. ✅ **Prepared test data** (Slide 31) - added the bias term to the test point
5. ✅ **Made prediction** (Slide 32) - computed ŷ = [1, x] · [w₀, w₁] for a new point

### Key Linear Regression Concepts:

| Concept | Description |
|---------|-------------|
| **Design Matrix (Φ)** | Training data with bias column: [[1, x⁽¹⁾], [1, x⁽²⁾], ...] |
| **Weights (w)** | [w₀, w₁] where w₀ = intercept, w₁ = slope |
| **Normal Equation** | w = (ΦᵀΦ)⁻¹Φᵀy gives optimal solution directly |
| **Prediction** | ŷ = w₀ + w₁x (or [1,x] · w in matrix form) |

### Key NumPy Operations Used:

- **`np.c_[ones, X]`** - concatenate columns to add bias term
- **`.T`** - transpose matrix (Φ → Φᵀ)
- **`@`** - matrix multiplication operator
- **`np.linalg.inv()`** - compute matrix inverse
- **`Phi.T @ Phi`** - compute ΦᵀΦ (Gram matrix)
- **`X @ w`** - compute predictions via matrix-vector multiplication
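
`np.linalg.inv()` mirrors the formula on the slides, but explicitly forming an inverse is generally less numerically robust than solving the linear system directly. Two commonly used NumPy alternatives are sketched below, on a small synthetic dataset (the seed, sample size, and variable names are illustrative assumptions); this is a comparison for reference, not the method used in the slides:

```python
import numpy as np

# Small synthetic dataset with the same slope and intercept as Step 2
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 50)
y = 1.035 * x + 1.069 + rng.normal(0.0, 2.0, x.size)
Phi = np.c_[np.ones(x.size), x]

# 1) Explicit inverse, exactly as in the normal equation
w_inv = np.linalg.inv(Phi.T @ Phi) @ Phi.T @ y

# 2) Solve the system (ΦᵀΦ) w = Φᵀy without forming the inverse
w_solve = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# 3) Least-squares solver that works on Φ directly
w_lstsq, *_ = np.linalg.lstsq(Phi, y, rcond=None)

print("inv:  ", w_inv)
print("solve:", w_solve)
print("lstsq:", w_lstsq)
```

On a well-conditioned problem like this one, all three give the same weights up to rounding.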

### Linear Regression vs KNN Regression:

| Aspect | Linear Regression | KNN Regression |
|--------|------------------|----------------|
| **Model** | Parametric (learns fixed weights) | Non-parametric (uses training data directly) |
| **Training** | Closed-form solution (one linear solve, no iterations) | None (just stores the training data) |
| **Prediction** | Fast (just w₀ + w₁x) | Slower (must find K neighbors) |
| **Assumes** | Linear relationship | Local similarity |
| **Memory** | Only stores weights | Must store all training data |
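
The prediction and memory rows of the table can be illustrated with a small sketch. The KNN function here is a deliberately simple one-dimensional version written for this comparison (the helper names, `k=5`, the seed, and the dataset are all illustrative assumptions, not taken from the slides):

```python
import numpy as np

# Small synthetic dataset with the same slope and intercept as Step 2
rng = np.random.default_rng(0)
x_train = np.linspace(-5.0, 5.0, 50)
y_train = 1.035 * x_train + 1.069 + rng.normal(0.0, 2.0, x_train.size)

def predict_linear(x_new, w):
    """Linear regression: needs only the two stored weights."""
    return w[0] + w[1] * x_new

def predict_knn(x_new, x_train, y_train, k=5):
    """KNN regression: needs the full training set at prediction time."""
    distances = np.abs(x_train - x_new)      # distance to every training point
    nearest = np.argsort(distances)[:k]      # indices of the k closest points
    return y_train[nearest].mean()           # average their targets

# Fit the linear model once with the normal equation
Phi = np.c_[np.ones(x_train.size), x_train]
w = np.linalg.inv(Phi.T @ Phi) @ Phi.T @ y_train

print(f"Linear prediction at x=5:    {predict_linear(5.0, w):.3f}")
print(f"KNN (k=5) prediction at x=5: {predict_knn(5.0, x_train, y_train):.3f}")
```

After fitting, `predict_linear` no longer touches the training arrays, while `predict_knn` scans all of them on every call.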

### Why This Approach?

- **Step-by-step breakdown** helps you understand the mathematical operations
- **Matrix operations** provide efficient implementation

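In practice, linear regression works well when the relationship between x and y is approximately linear. With a small number of features, the closed-form solution is also convenient: it avoids the step-size tuning and repeated passes over the data that iterative methods need.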