# 📘 Notebook 2: Derive Linear Regression from First Principles
In this notebook, we derive linear regression step by step from first principles, with no black boxes. We build intuition, develop the math, and connect it to real-world modeling.

## Goals
- Understand what linear regression is really doing
- Derive the closed-form solution
- Visualize the geometry of projection
- Implement regression from scratch and explain every function used


## ✍️ Step 1: Simulate Simple Linear Data

In [None]:
# Simulate data: y = 3x + noise
import numpy as np
import plotly.express as px

np.random.seed(0)  # fix the random seed so the noise is reproducible
X = np.linspace(0, 10, 50).reshape(-1, 1)  # 50 evenly spaced x values as a column vector
true_coef = 3
noise = np.random.normal(0, 2, X.shape[0])  # Gaussian noise: mean 0, standard deviation 2
y = true_coef * X.flatten() + noise

# Visualize
px.scatter(x=X.flatten(), y=y, labels={'x':'x', 'y':'y'}, title='Simulated Linear Data with Noise')

## 📐 Step 2: Define the Model and Loss Function
We assume a model of the form:
$$ y = X\beta + \varepsilon $$
Where:
- $X$ is the matrix of input features (with a leading column of ones when the model includes an intercept, as in Step 4)
- $\beta$ is the vector of parameters
- $\varepsilon$ is the error term

We define our loss function as the sum of squared residuals:
$$ L(\beta) = \|y - X\beta\|^2 $$
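To make the loss concrete, here is a minimal sketch that evaluates $L(\beta)$ at a few candidate slopes. The helper name `sum_squared_residuals` and the variables `x`, `y_sim`, and `X_design` are illustrative assumptions, not part of the notebook's own code. The sketch recreates the Step 1 data and, to keep it simple, uses a design matrix with no intercept column.

```python
import numpy as np

def sum_squared_residuals(X_design, y, beta):
    """Return L(beta) = ||y - X beta||^2 for a given parameter vector."""
    residuals = y - X_design @ beta
    return float(residuals @ residuals)

# Recreate the simulated data from Step 1 so this cell runs on its own.
np.random.seed(0)
x = np.linspace(0, 10, 50)
y_sim = 3 * x + np.random.normal(0, 2, x.shape[0])

# Design matrix with a single slope column (no intercept, for simplicity).
X_design = x.reshape(-1, 1)

# Evaluate the loss at a few candidate slopes.
losses = {}
for slope in (1.0, 3.0, 5.0):
    losses[slope] = sum_squared_residuals(X_design, y_sim, np.array([slope]))
    print(f"slope = {slope:.1f}  ->  loss = {losses[slope]:.1f}")
```

Because the data were generated with a true slope of 3, the loss at slope 3 should come out far smaller than at 1 or 5.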

## 🧮 Step 3: Derive the Closed-Form Solution
To minimize $L(\beta)$, we take the gradient with respect to $\beta$ and set it to zero:
$$ \frac{\partial}{\partial \beta} \|y - X\beta\|^2 = -2X^T(y - X\beta) = 0 $$
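For readers who want the intermediate step, the gradient can be obtained by expanding the squared norm first. This uses only the standard identities $\frac{\partial}{\partial \beta}(\beta^T a) = a$ and $\frac{\partial}{\partial \beta}(\beta^T A \beta) = 2A\beta$ for symmetric $A$:
$$ \|y - X\beta\|^2 = (y - X\beta)^T(y - X\beta) = y^Ty - 2\beta^TX^Ty + \beta^TX^TX\beta $$
Differentiating term by term:
$$ \frac{\partial}{\partial \beta}\left( y^Ty - 2\beta^TX^Ty + \beta^TX^TX\beta \right) = -2X^Ty + 2X^TX\beta = -2X^T(y - X\beta) $$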
Solving:
$$ X^TX\beta = X^Ty $$
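Assuming $X^TX$ is invertible (which holds when the columns of $X$ are linearly independent), the least-squares estimate is:
$$ \hat{\beta} = (X^TX)^{-1}X^Ty $$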
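The normal equation $X^TX\beta = X^Ty$ says that the residual $y - X\beta$ is orthogonal to every column of $X$. This is the projection geometry behind least squares. The sketch below checks that numerically on the Step 1 data. The names `X_design`, `y_sim`, and `beta_check` are assumptions made for this illustration.

```python
import numpy as np

# Recreate the simulated data from Step 1 so this cell runs on its own.
np.random.seed(0)
x = np.linspace(0, 10, 50)
y_sim = 3 * x + np.random.normal(0, 2, x.shape[0])

# Design matrix: a column of ones (intercept) next to x.
X_design = np.column_stack([np.ones_like(x), x])

# Solve the normal equations X^T X beta = X^T y as a linear system.
beta_check = np.linalg.solve(X_design.T @ X_design, X_design.T @ y_sim)

# Gradient of the loss at the solution: -2 X^T (y - X beta).
residuals = y_sim - X_design @ beta_check
gradient = -2 * X_design.T @ residuals

print("gradient at solution:", gradient)
```

Up to floating-point error, both entries of the gradient should be zero, which confirms that the residual vector is orthogonal to the column of ones and to the column of $x$ values.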

## 🧪 Step 4: Implement Linear Regression from Scratch

In [None]:
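def linear_regression_closed_form(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Solve for beta with the normal equation; returns [intercept, slope(s)]."""
    # np.ones builds a column of 1s; np.c_ stacks it next to X so the model gets an intercept.
    X_b = np.c_[np.ones((X.shape[0], 1)), X]
    # beta_hat = (X^T X)^{-1} X^T y; np.linalg.inv inverts the matrix, @ is matrix multiplication.
    beta_hat = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
    return beta_hat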

beta = linear_regression_closed_form(X, y)
print(f"Estimated coefficients: Intercept = {beta[0]:.2f}, Slope = {beta[1]:.2f}")
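
As a sanity check on the closed-form result, it can be compared with NumPy's built-in least-squares solver, `np.linalg.lstsq`. That solver avoids forming an explicit inverse and is generally the more numerically robust choice when $X^TX$ is close to singular. This is a hedged sketch, not the notebook's method: the names `X_design`, `y_sim`, `beta_inv`, and `beta_lstsq` are illustrative, and the data are recreated so the cell runs on its own.

```python
import numpy as np

# Recreate the simulated data from Step 1.
np.random.seed(0)
x = np.linspace(0, 10, 50)
y_sim = 3 * x + np.random.normal(0, 2, x.shape[0])
X_design = np.column_stack([np.ones_like(x), x])

# Closed form, mirroring the formula derived in Step 3.
beta_inv = np.linalg.inv(X_design.T @ X_design) @ X_design.T @ y_sim

# Library solver; lstsq returns (solution, residuals, rank, singular values).
beta_lstsq, *_ = np.linalg.lstsq(X_design, y_sim, rcond=None)

print("closed form:", beta_inv)
print("lstsq:      ", beta_lstsq)
print("agree:", np.allclose(beta_inv, beta_lstsq))
```

On a small, well-conditioned problem like this one, the two estimates should agree to within floating-point tolerance.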

## 📊 Step 5: Visualize Predictions vs Reality

In [None]:
import plotly.graph_objects as go

# Generate predictions with the same bias-augmented design matrix used for fitting
X_b = np.c_[np.ones((X.shape[0], 1)), X]
y_pred = X_b @ beta

# Plot observed points and the fitted line
fig = go.Figure()
fig.add_trace(go.Scatter(x=X.flatten(), y=y, mode='markers', name='Observed'))
fig.add_trace(go.Scatter(x=X.flatten(), y=y_pred, mode='lines', name='Predicted'))
fig.update_layout(title='Observed vs Predicted', xaxis_title='x', yaxis_title='y')
fig.show()
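
The plot can be complemented with a single number: the root-mean-square error (RMSE) of the residuals. Since the data were simulated with noise of standard deviation 2, an RMSE of roughly that size would suggest the fit has captured the linear trend and left mostly noise. This is a minimal sketch under that assumption. The names `X_design`, `y_sim`, and `beta_fit` are illustrative, and the fit is recomputed with `np.linalg.lstsq` so the cell is self-contained.

```python
import numpy as np

# Recreate the simulated data from Step 1 and refit.
np.random.seed(0)
x = np.linspace(0, 10, 50)
y_sim = 3 * x + np.random.normal(0, 2, x.shape[0])
X_design = np.column_stack([np.ones_like(x), x])
beta_fit, *_ = np.linalg.lstsq(X_design, y_sim, rcond=None)

# Residuals are observed minus predicted values.
residuals = y_sim - X_design @ beta_fit
rmse = float(np.sqrt(np.mean(residuals ** 2)))

print(f"RMSE = {rmse:.2f} (simulated noise std was 2)")
```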