# MULTIPLE LINEAR REGRESSION (MLR)

Regression with two or more independent variables.

### Goal

To find the best linear combination of inputs to predict Y.

How are the parameters calculated?

Using matrix form:

Equation:

$$ \beta = (X^{T} X)^{-1} X^{T} Y $$

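X = matrix of features (one row per sample, with a leading column of ones so that the first coefficient is the intercept)

Y = column vector of outputs

β = column vector of coefficients (intercept and slopes)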
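As a complement to the closed-form equation, here is a minimal sketch that solves the same least-squares problem with `np.linalg.lstsq`, which avoids forming the explicit inverse of XᵀX. It reuses the toy data from the notebook's code cells; the variable names are illustrative choices.

```python
import numpy as np

# Toy data: 5 samples, 2 features (same values as the notebook's code cells)
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
Y = np.array([5, 6, 7, 10, 11], dtype=float)

# Prepend a column of ones so the first coefficient acts as the intercept
X_b = np.column_stack([np.ones(len(X)), X])

# Least-squares solve of X_b @ beta ≈ Y without inverting X_b.T @ X_b
beta, residuals, rank, singular_values = np.linalg.lstsq(X_b, Y, rcond=None)

print("Intercept (b0):", beta[0])
print("Slopes (b1, b2):", beta[1:])
```

The result should match the normal equation up to floating-point rounding; `lstsq` is usually the safer choice when features are nearly collinear.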


### Correlation vs Regression

Correlation → Measures the strength and direction of the linear relationship between two variables.

Regression → Models how Y depends on the inputs and predicts Y.
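To make the difference concrete, the sketch below computes both quantities on a small made-up dataset: the correlation coefficient is a single number describing the relationship, while the regression line can be used to predict Y for a new input. The data values are assumed for illustration.

```python
import numpy as np

# Illustrative data (assumed): one input x and one output y
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([5, 6, 7, 10, 11], dtype=float)

# Correlation: one number between -1 and 1
r = np.corrcoef(x, y)[0, 1]

# Regression: a fitted line y = slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)

# Only the regression can produce a prediction for a new x
y_new = slope * 6 + intercept

print("Correlation r:", r)
print("Slope, intercept:", slope, intercept)
print("Predicted y at x = 6:", y_new)
```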

### Multicollinearity

When independent variables are highly correlated with each other.

#### Problems:

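Coefficient estimates become unstable: small changes in the data can change their size or even their sign

Individual coefficients become hard to interpret, and accuracy on new data can drop

Detection: a variance inflation factor (VIF) above 10 is a common rule of thumb for problematic multicollinearity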
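The VIF of a feature is 1 / (1 − R²), where R² comes from regressing that feature on all the other features. A minimal NumPy sketch of that definition follows; the helper name `vif` is a hypothetical choice, not a library function.

```python
import numpy as np

def vif(X):
    """Return the variance inflation factor of each column of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]                       # feature being explained
        others = np.delete(X, j, axis=1)       # all remaining features
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Toy feature matrix (same values as the notebook's code cells)
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]])
print("VIF per feature:", vif(X))
```

Both features here have a VIF well below 10. A feature that is an exact linear combination of the others gives R² = 1, so its VIF is infinite.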

#### Solutions:

Remove correlated features

Use PCA

Use Lasso regression
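The first remedy, removing correlated features, can be sketched as a simple greedy filter on the correlation matrix. The helper name `drop_correlated`, the 0.9 threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def drop_correlated(X, threshold=0.9):
    """Keep a feature only if its |correlation| with every kept feature is <= threshold."""
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature correlations
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, i] <= threshold for i in keep):
            keep.append(j)
    return X[:, keep], keep

# Synthetic data (assumed): column 1 is almost a copy of column 0
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=100), b])

X_reduced, kept = drop_correlated(X)
print("Kept column indices:", kept)
print("Reduced shape:", X_reduced.shape)
```

This filter keeps the earlier column of any highly correlated pair, so column order matters; a more careful approach would decide which one to drop using domain knowledge or VIF.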

### Dummy Variables

Used for categorical features:

Example: Gender → Male, Female
Convert to:

Male = 1

Female = 0
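A minimal NumPy sketch of this encoding follows. It also shows a category with three levels, where one level is dropped as the baseline so the dummy columns are not perfectly collinear with the intercept. The sample values, including the city names, are hypothetical.

```python
import numpy as np

# Two-level category: a single 0/1 column is enough
gender = np.array(["Male", "Female", "Female", "Male"])
is_male = (gender == "Male").astype(int)
print(is_male)

# Three-level category (hypothetical values): use k - 1 = 2 dummy columns
city = np.array(["Delhi", "Pune", "Goa", "Pune"])
levels = np.unique(city)               # sorted unique levels
baseline, encoded_levels = levels[0], levels[1:]

# Each column answers "is this row equal to that level?"
dummies = (city[:, None] == encoded_levels[None, :]).astype(int)
print("Baseline level:", baseline)
print("Dummy columns for:", encoded_levels)
print(dummies)
```

Rows belonging to the baseline level are represented by zeros in every dummy column.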

### Feature Scaling

Important for numerical stability, and for comparing coefficients of features measured on different scales.

Two common types:

Standardization: (X - mean) / std

MinMax Scaling: (X - min) / (max - min)
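Both formulas can be applied column by column with NumPy broadcasting. This is a minimal sketch on the notebook's toy feature matrix; it uses the population standard deviation (NumPy's default), which is an assumption about which definition of "std" is intended.

```python
import numpy as np

# Toy feature matrix (same values as the notebook's code cells)
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)

# Standardization: each column ends up with mean 0 and standard deviation 1
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Min-max scaling: each column is mapped to the range [0, 1]
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print("Standardized:\n", X_std)
print("Min-max scaled:\n", X_minmax)
```

In practice the mean, std, min, and max should be computed on the training data only and then reused to transform new data.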


### Performance Metrics for MLR

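The same metrics as simple linear regression (SLR), plus:

#### Adjusted R²

$$ R^2_{adj} = 1 - (1 - R^2) \cdot \frac{n - 1}{n - k - 1} $$

n = number of observations, k = number of features

Penalizes R² for the number of features, so it rises only when an added feature improves the fit enough to justify the extra parameter.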
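The adjusted R² formula above translates directly into a few lines of NumPy. This sketch fits the notebook's toy data by least squares and then applies the adjustment; the helper name `adjusted_r2` is a hypothetical choice.

```python
import numpy as np

def adjusted_r2(r2, n, k):
    """Adjusted R² for n observations and k features (hypothetical helper)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Toy data (same values as the notebook's code cells)
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
Y = np.array([5, 6, 7, 10, 11], dtype=float)
n, k = X.shape

# Fit by least squares with an intercept column
X_b = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(X_b, Y, rcond=None)

# Plain R² = 1 - SS_res / SS_tot
ss_res = ((Y - X_b @ beta) ** 2).sum()
ss_tot = ((Y - Y.mean()) ** 2).sum()
r2 = 1.0 - ss_res / ss_tot

print("R²:", r2)
print("Adjusted R²:", adjusted_r2(r2, n, k))
```

The adjusted value is always at most the plain R², and the gap widens as k grows relative to n.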

#### p-values

Tell whether a feature's coefficient is statistically distinguishable from zero:

p < 0.05 → significant

p ≥ 0.05 → not significant; consider removing the feature
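For ordinary least squares, each coefficient's p-value comes from a t-test: the coefficient divided by its standard error, compared against a t distribution with n − k − 1 degrees of freedom. The sketch below computes this by hand on the notebook's toy data. It assumes SciPy is installed for the t distribution; the variable names are illustrative.

```python
import numpy as np
from scipy import stats  # assumption: SciPy is available

# Toy data (same values as the notebook's code cells)
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
Y = np.array([5, 6, 7, 10, 11], dtype=float)
n, k = X.shape

X_b = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(X_b, Y, rcond=None)

# Residual variance estimate with n - k - 1 degrees of freedom
residuals = Y - X_b @ beta
dof = n - k - 1
sigma2 = residuals @ residuals / dof

# Standard errors from the diagonal of sigma² (XᵀX)⁻¹
cov_beta = sigma2 * np.linalg.inv(X_b.T @ X_b)
std_err = np.sqrt(np.diag(cov_beta))

# Two-sided p-values from the t distribution
t_stats = beta / std_err
p_values = 2 * stats.t.sf(np.abs(t_stats), df=dof)

for name, p in zip(["b0", "b1", "b2"], p_values):
    print(f"{name}: p = {p:.3f}")
```

On this data the first slope falls just under 0.05 while the second is far above it, but with only five rows the test has very little power, so the numbers are illustrative only.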

#### F-statistic

Tests overall model significance: whether at least one feature's coefficient is non-zero.
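The F-statistic compares the variance explained per feature with the unexplained variance per residual degree of freedom. A minimal NumPy sketch on the notebook's toy data, with illustrative variable names:

```python
import numpy as np

# Toy data (same values as the notebook's code cells)
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
Y = np.array([5, 6, 7, 10, 11], dtype=float)
n, k = X.shape

X_b = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(X_b, Y, rcond=None)
y_hat = X_b @ beta

ss_res = ((Y - y_hat) ** 2).sum()     # unexplained variation
ss_tot = ((Y - Y.mean()) ** 2).sum()  # total variation
ss_reg = ss_tot - ss_res              # variation explained by the model

# F = (explained / k) / (unexplained / (n - k - 1))
f_stat = (ss_reg / k) / (ss_res / (n - k - 1))
print("F-statistic:", f_stat)
```

The matching p-value comes from an F distribution with (k, n − k − 1) degrees of freedom; a large F with a small p-value suggests the model explains more than an intercept-only model would.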




```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Features (X): 5 samples, 2 features each
X = np.array([
    [1, 2],
    [2, 1],
    [3, 4],
    [4, 3],
    [5, 5]
])

# Target (Y)
Y = np.array([5, 6, 7, 10, 11])

# Create model
model = LinearRegression()

# Train model
model.fit(X, Y)

print("Coefficients (b1, b2):", model.coef_)  # Slopes
print("Intercept (b0):", model.intercept_)

# Predict for new values
new_data = np.array([[6, 2]])
prediction = model.predict(new_data)
print("Prediction:", prediction)
```


```
Coefficients (b1, b2): [ 1.77777778 -0.22222222]
Intercept (b0): 3.1333333333333346
Prediction: [13.35555556]
```


```python
import numpy as np

# Dataset
X = np.array([
    [1, 2],
    [2, 1],
    [3, 4],
    [4, 3],
    [5, 5]
])

Y = np.array([5, 6, 7, 10, 11]).reshape(-1, 1)

# Add bias column (intercept)
X_b = np.c_[np.ones((X.shape[0], 1)), X]

# Normal Equation: beta = (X^T X)^-1 X^T Y
beta = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(Y)

print("Intercept (b0):", beta[0][0])
print("Coefficients (b1, b2):", beta[1:].flatten())

# Predict
new_data = np.array([1, 6, 2]).reshape(1, -1)  # 1 (for intercept), x1=6, x2=2
prediction = new_data.dot(beta)
print("Prediction:", prediction)
```


```
Intercept (b0): 3.133333333333321
Coefficients (b1, b2): [ 1.77777778 -0.22222222]
Prediction: [[13.35555556]]
```
