# 📊 Multiple Linear Regression

## 🔍 What is It?

**Multiple Linear Regression** (MLR) is an extension of simple linear regression. It models the relationship between one **dependent variable** and **two or more independent variables**.

---

## 🧮 Model Equation

The MLR equation is:

$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \epsilon
$$

Where:
- $y$ = target/dependent variable  
- $x_1, x_2, \dots, x_p$ = independent variables (features)  
- $\beta_0$ = intercept  
- $\beta_1, \dots, \beta_p$ = coefficients (slopes)  
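- $\epsilon$ = error term (the part of $y$ that the linear combination does not explain; residuals are its estimates from a fitted model)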
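
To make the equation concrete, the sketch below evaluates its deterministic part for a single observation with $p = 2$. The coefficient and feature values are made up purely for illustration.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
beta_0 = 1.0                    # intercept
beta = np.array([2.0, -0.5])    # slopes beta_1, beta_2
x = np.array([3.0, 4.0])        # one observation: x_1, x_2

# Deterministic part of the model (the error term is unknown in practice)
y_hat = beta_0 + beta @ x
print(y_hat)  # 1 + 2*3 - 0.5*4 = 5.0
```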

---

## 🎯 Goal

Estimate the coefficients $\beta_0, \beta_1, \dots, \beta_p$ so that the model minimizes the **sum of squared differences** between actual and predicted values. This criterion is called **Ordinary Least Squares (OLS)**.
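
Written out with an observation index $i = 1, \dots, n$ (a notation assumed here, where $x_{ij}$ is the value of feature $j$ for observation $i$), the OLS criterion can be stated as:

$$
\hat{\beta}_0, \dots, \hat{\beta}_p = \arg\min_{\beta_0, \dots, \beta_p} \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{i1} - \dots - \beta_p x_{ip} \right)^2
$$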

---

## 🧠 Matrix Form

For compact representation, we write it as:

$$
\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}
$$

Where:
- $\mathbf{y}$ is an $n \times 1$ vector of outputs  
- $\mathbf{X}$ is an $n \times (p + 1)$ matrix of inputs (with a column of 1s for intercept)  
- $\boldsymbol{\beta}$ is a $(p + 1) \times 1$ vector of coefficients  
- $\boldsymbol{\epsilon}$ is an $n \times 1$ error vector
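
As a small illustration of the shapes above, the following sketch builds $\mathbf{X}$ from a raw feature array by prepending the column of 1s. The feature values are hypothetical.

```python
import numpy as np

# Hypothetical raw features: n = 4 observations, p = 2 features
features = np.array([[1.0, 2.0],
                     [2.0, 1.0],
                     [3.0, 5.0],
                     [4.0, 3.0]])

# Prepend a column of 1s so that beta_0 acts as the intercept
X = np.column_stack([np.ones(len(features)), features])
print(X.shape)  # (4, 3), i.e. n x (p + 1)
```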

---

## 🧮 Coefficient Estimation (Normal Equation)

To find the optimal $\boldsymbol{\beta}$ that minimizes squared errors:

$$
\boldsymbol{\hat{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
$$

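This works only if $\mathbf{X}^T \mathbf{X}$ is invertible, which requires $\mathbf{X}$ to have full column rank: at least $p + 1$ observations, and no feature that is an exact linear combination of the others.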
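
A minimal NumPy sketch of the normal equation on synthetic data follows. The data, the seed, and the "true" coefficients are assumptions made for the example. It solves the linear system rather than forming the inverse explicitly, which is usually the numerically safer choice, and compares the result with `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 2
features = rng.normal(size=(n, p))
true_beta = np.array([1.0, 2.0, -3.0])   # hypothetical [beta_0, beta_1, beta_2]

X = np.column_stack([np.ones(n), features])        # n x (p + 1) design matrix
y = X @ true_beta + rng.normal(scale=0.1, size=n)  # add a small error term

# Normal equation, solved as a linear system instead of inverting X^T X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# General least-squares solver, which also copes with a rank-deficient X
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)
print(np.allclose(beta_hat, beta_lstsq))
```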

---

## 📊 Assumptions

1. **Linearity**: Relationship between dependent and independent variables is linear  
2. **No multicollinearity**: Independent variables are not highly correlated  
3. **Homoscedasticity**: Constant variance of errors  
4. **Independence**: Errors are independent  
5. **Normality**: Errors are normally distributed (for inference)
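
One common way to check assumption 2 is the variance inflation factor (VIF), $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing feature $j$ on the remaining features. The sketch below is one possible NumPy implementation on synthetic data. The helper name `vif` and the data are assumptions for illustration, and the often quoted warning range of roughly 5 to 10 is a rule of thumb rather than a fixed threshold.

```python
import numpy as np

def vif(features):
    """Variance inflation factor for each column of an (n, p) feature array."""
    n, p = features.shape
    out = np.empty(p)
    for j in range(p):
        target = features[:, j]
        others = np.delete(features, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + other features
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)   # nearly a copy of a

vifs = vif(np.column_stack([a, b, c]))
print(vifs)  # large values for columns 0 and 2, close to 1 for column 1
```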

---

## 🔍 Interpretation

- Each $\beta_j$ shows the **change in $y$** for a **unit change in $x_j$**, keeping other variables constant.
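- The **sign** of $\beta_j$ gives the direction of the effect. Its **magnitude** depends on the units of $x_j$, so compare magnitudes across features only after putting the features on a common scale.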
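
The "unit change, other variables constant" reading can be checked directly on a fitted model. The sketch below uses synthetic data with made-up coefficients: it raises $x_1$ by one unit while holding $x_2$ fixed and compares the change in the prediction with the fitted coefficient.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
# Hypothetical data-generating process, for illustration only
y = 4.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)

x_base = np.array([[0.5, 1.0]])
x_step = x_base + np.array([[1.0, 0.0]])  # x_1 up by one unit, x_2 unchanged

delta = model.predict(x_step)[0] - model.predict(x_base)[0]
print(delta, model.coef_[0])  # the two values agree up to rounding
```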

---

## 🧪 Evaluation Metrics

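- $R^2$: Proportion of the variance in $y$ explained by the model  
- Adjusted $R^2$: $R^2$ corrected for the number of features, so adding uninformative features is penalized  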
- MAE, MSE, RMSE: Error-based metrics
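
The metrics above can be computed by hand as in the following sketch. The helper name `regression_metrics` and the sample values are hypothetical, and the adjusted $R^2$ uses the usual form $1 - (1 - R^2)\frac{n - 1}{n - p - 1}$.

```python
import numpy as np

def regression_metrics(y_true, y_pred, p):
    """Return common regression metrics; p is the number of features."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = len(y_true)
    resid = y_true - y_pred
    mse = np.mean(resid ** 2)
    ss_res = np.sum(resid ** 2)                      # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return {
        "MAE": np.mean(np.abs(resid)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "R2": r2,
        "Adjusted R2": adj_r2,
    }

y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 9.5, 10.5]
metrics = regression_metrics(y_true, y_pred, p=2)
print(metrics)
```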

---

## 🛠️ Python Example (Scikit-learn)

```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Toy data: 3 samples, 2 features
# (note that x2 = x1 + 1 here, so the two features are perfectly collinear)
X = np.array([[1, 2], [2, 3], [4, 5]])
y = np.array([3, 5, 9])

# Fit an ordinary least squares model
model = LinearRegression()
model.fit(X, y)

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("R^2 Score:", model.score(X, y))
```

Output:

```
Coefficients: [1. 1.]
Intercept: -8.881784197001252e-16
R^2 Score: 1.0
```

The intercept is zero up to floating-point rounding.

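In the end, multiple linear regression with two features amounts to finding the plane that lies closest, in the least-squares sense, to all the points. With more than two features, the plane becomes a hyperplane.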

```python
from sklearn.datasets import make_regression
import pandas as pd
import numpy as np

import plotly.express as px
import plotly.graph_objects as go

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
```

```python
# Generate synthetic data: 100 samples, 2 informative features, noisy target
X, y = make_regression(n_samples=100, n_features=2,
                       n_informative=2, n_targets=1, noise=50)
```

```python
# Collect the features and the target in one DataFrame
df = pd.DataFrame({'feature1': X[:, 0], 'feature2': X[:, 1], 'target': y})
df.shape
```

Output:

```
(100, 3)
```

```python
df.sample(3)
```

| | feature1 | feature2 | target |
|---|---|---|---|
| 42 | -0.264969 | -1.666308 | -174.40918 |
| 71 | 1.315768 | -0.710531 | -2.041935 |
| 88 | 1.499555 | 0.092686 | 104.457501 |


```python
# 3D scatter plot of the two features against the target
fig = px.scatter_3d(df, x='feature1', y='feature2', z='target')
fig.show()
```

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Fit on the training split, then predict on the held-out split
lr = LinearRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)
```

```python
# scikit-learn metrics take the true values first, then the predictions
print("MSE: ", mean_squared_error(y_test, y_pred))
print("MAE: ", mean_absolute_error(y_test, y_pred))
```

Output:

```
MSE:  3486.9056015384385
MAE:  48.75171708396895
```
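
The imported `r2_score` is not used in the run above. As a hedged, self-contained sketch, the same workflow can be extended to report RMSE and $R^2$ on the held-out split. The `random_state=0` arguments are an addition made here for reproducibility, so the numbers it prints will not match the MSE and MAE shown above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# random_state is an assumption added for reproducibility
X, y = make_regression(n_samples=100, n_features=2, n_informative=2,
                       n_targets=1, noise=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

lr = LinearRegression().fit(X_train, y_train)
y_pred = lr.predict(X_test)

rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # same units as the target
r2 = r2_score(y_test, y_pred)                       # true values first
print("RMSE:", rmse)
print("R2:  ", r2)
```

RMSE is often easier to read than MSE because it is expressed in the units of the target.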
