In [1]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

In [None]:
""" 
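The multiple linear regression equation with k predictors is:
 y = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ + ε

(k counts the predictors; n is reserved below for the number of observations.)

EXAMPLE: Predicting house prices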
    Price = β₀ + β₁(Size) + β₂(Bedrooms) + β₃(Age) + ε

MATRIX FORM

We can write multiple regression in matrix form:
    Y = Xβ + ε

Where:
- Y is an (n×1) vector of outcomes
- X is an (n×p) matrix of predictors (includes column of 1s for intercept)
- β is a (p×1) vector of coefficients
- ε is an (n×1) vector of errors

NORMAL EQUATION (Least Squares Solution)

To find the best β that minimizes the sum of squared errors, we use:

    β = (XᵀX)⁻¹XᵀY

DERIVATION:
1. We want to minimize: RSS = ||Y - Xβ||²
2. RSS = (Y - Xβ)ᵀ(Y - Xβ)
3. Expand: RSS = YᵀY - 2βᵀXᵀY + βᵀXᵀXβ   (YᵀXβ is a scalar, so it equals βᵀXᵀY)
4. Take derivative with respect to β: ∂RSS/∂β = -2XᵀY + 2XᵀXβ
5. Set to zero: -2XᵀY + 2XᵀXβ = 0
6. Solve for β: XᵀXβ = XᵀY
7. Final solution: β = (XᵀX)⁻¹XᵀY   (requires XᵀX to be invertible, i.e. the columns of X are linearly independent)

"""
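
The derivation above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the toy data, the coefficient values (3, 2, -1), and the names `features`, `beta`, `gradient`, and `beta_lstsq` are assumptions made only for illustration. It builds X with a column of 1s, solves the normal equations from step 6, and confirms that the gradient from step 4 is approximately zero at the solution.

```python
import numpy as np

# Hypothetical toy data: 6 observations, 2 predictors (values are illustrative only)
rng = np.random.default_rng(0)
features = rng.uniform(0, 10, size=(6, 2))
Y = 3.0 + 2.0 * features[:, 0] - 1.0 * features[:, 1] + rng.normal(0, 0.1, size=6)

# Design matrix X: prepend a column of 1s so the first coefficient is the intercept
X = np.column_stack([np.ones(len(features)), features])
print("X shape:", X.shape)  # (n, p) with n = 6 observations and p = 3 columns

# Step 6: solve XᵀXβ = XᵀY directly instead of forming the inverse explicitly
beta = np.linalg.solve(X.T @ X, X.T @ Y)

# Step 4: the gradient -2XᵀY + 2XᵀXβ should vanish at the minimizer (up to rounding)
gradient = -2 * X.T @ Y + 2 * X.T @ X @ beta
print("max |gradient|:", np.abs(gradient).max())

# Cross-check against NumPy's general least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print("normal equations:", beta)
print("np.linalg.lstsq: ", beta_lstsq)
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a design choice in this sketch: it gives the same β when XᵀX is invertible and is generally the more numerically stable option.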

In [None]:
# Creating dummy data 

# Set random seed for reproducibility 
np.random.seed(42)
# Generate dummy data
n_sample = 200

# Feature 1: House size (1000-3000 sqft)
# Drawn from a uniform distribution: every value in the range is equally likely
size = np.random.uniform(1000, 3000, n_sample)

# Feature 2: Number of bedrooms (2-5); randint excludes the upper bound, so 6 is passed
bedrooms = np.random.randint(2, 6, n_sample)

# Feature 3: House age (0 - 30 years)
age = np.random.uniform(0, 30, n_sample)

# True relationship (what we are trying to discover)
# Price = 50000 + 150 * size  + 20000 * bedrooms - 2000 * age + noise

true_intercept = 50000
true_coef_size = 150
true_coef_bedrooms = 20000
true_coef_age = -2000

# Generate the target variable with Gaussian noise
# np.random.normal draws values clustered around a mean (here mean 0, standard deviation 30)
noise = np.random.normal(0, 30, n_sample)
price = (true_intercept +
         true_coef_size * size +
         true_coef_bedrooms * bedrooms +
         true_coef_age * age +
         noise)
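
As a sanity check on the generated data, the true coefficients should be recoverable by least squares. The sketch below is self-contained, so it repeats the data generation above with the same seed and parameters. The names `X` and `beta_hat` and the choice of `np.linalg.lstsq` as the solver are assumptions of this example, not part of the original notebook.

```python
import numpy as np

# Regenerate the same dummy data as above (same seed, same order of random calls)
np.random.seed(42)
n_sample = 200
size = np.random.uniform(1000, 3000, n_sample)
bedrooms = np.random.randint(2, 6, n_sample)
age = np.random.uniform(0, 30, n_sample)
noise = np.random.normal(0, 30, n_sample)
price = 50000 + 150 * size + 20000 * bedrooms - 2000 * age + noise

# Design matrix: column of 1s for the intercept, then one column per feature
X = np.column_stack([np.ones(n_sample), size, bedrooms, age])

# Least-squares estimate of β; lstsq solves the same minimization as the normal equation
beta_hat, *_ = np.linalg.lstsq(X, price, rcond=None)

# Compare estimates with the coefficients used to generate the data
true_values = [50000, 150, 20000, -2000]
for name, est, true in zip(["intercept", "size", "bedrooms", "age"], beta_hat, true_values):
    print(f"{name:>10}: estimated {est:12.2f}   true {true}")
```

Because the noise is small relative to the prices, the estimates should land close to the true values; a larger noise standard deviation would widen the gap.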