# Multiple Linear Regression
### *Synthetic Examples for the Multivariate Linear Regression*

In this notebook, I use synthetic data to demonstrate some computational applications of the linear regression model.

In [1]:
# libs
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## 1. Closed-Form Solution

In the theoretical notebook, I derive a closed-form solution for multiple linear regression (MLR). Here, I show how to implement it with `NumPy`. Suppose we have the following model:

$$
\mathbf{y} = \beta_0 + \beta_1 \mathbf{x_1} + \beta_2 \mathbf{x_2} + \epsilon
$$

where we know that $\beta_0 = 1, \beta_1 = 0.5$, and $\beta_2 = 1.3$, while $\epsilon \sim \mathcal{N}(0, 2^2)$, i.e. Gaussian noise with standard deviation 2. We generate synthetic data from this model:

In [11]:
# generate some data
np.random.seed(0)
n_samples = 1000000

# model with two features
x1 = np.random.rand(n_samples) * 10  # Feature 1 (scaled uniformly between 0 and 10)
x2 = np.random.rand(n_samples) * 20  # Feature 2 (scaled uniformly between 0 and 20)
noise = np.random.randn(n_samples) * 2  # Gaussian noise with standard deviation 2
y = 1 + 0.5 * x1 + 1.3 * x2 + noise  # Target variable

# matrix X
X = np.column_stack((np.ones(n_samples), x1, x2))

Once we have the simulated data, we implement the closed-form solution derived in the theoretical notebook and check how well it recovers the true coefficients.
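For reference, the closed-form solution meant here is presumably the standard normal-equation estimator. The notation is an assumption, since the theoretical notebook is not reproduced here: $\mathbf{X}$ denotes the design matrix whose first column is all ones, and $\hat{\boldsymbol{\beta}}$ the vector of estimated coefficients.

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}
$$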

In [12]:
def ols_estimate(X, y):
    """Compute OLS estimates using the normal equation."""
    X_transpose = X.T
    beta_hat = np.linalg.inv(X_transpose @ X) @ X_transpose @ y
    return beta_hat

# estimate coefficients
beta_hat = ols_estimate(X, y)
print("Estimated coefficients:", beta_hat)

Estimated coefficients: [0.99860235 0.49944332 1.30022815]
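As a cross-check, the same estimate can be obtained without forming an explicit matrix inverse, which is generally the more numerically stable route. The sketch below is only an illustration: it regenerates the data with a smaller sample size (`n_samples = 100_000` is an assumed value chosen to keep it quick), compares `np.linalg.solve` on the normal equations with `np.linalg.lstsq`, and estimates the noise standard deviation from the residuals. The names `beta_solve`, `beta_lstsq`, and `sigma_hat` are introduced here and do not appear elsewhere in the notebook.

```python
import numpy as np

# Regenerate data the same way as in this notebook, but with a smaller
# sample (n_samples is an illustrative choice, not the notebook's value).
np.random.seed(0)
n_samples = 100_000
x1 = np.random.rand(n_samples) * 10
x2 = np.random.rand(n_samples) * 20
noise = np.random.randn(n_samples) * 2
y = 1 + 0.5 * x1 + 1.3 * x2 + noise
X = np.column_stack((np.ones(n_samples), x1, x2))

# Option 1: solve the normal equations (X^T X) beta = X^T y directly,
# without computing the inverse of X^T X.
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Option 2: a least-squares solver that works on X itself.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual standard deviation with the degrees-of-freedom correction (n - p).
residuals = y - X @ beta_solve
sigma_hat = np.sqrt(residuals @ residuals / (n_samples - X.shape[1]))

print("solve    :", beta_solve)
print("lstsq    :", beta_lstsq)
print("sigma_hat:", sigma_hat)
```

Both routes should agree to many decimal places on a well-conditioned problem like this one, and `sigma_hat` should land close to the standard deviation of 2 used to generate the noise.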
