# Model building with multiple predictors

Multiple Linear Regression (MLR) predicts a continuous outcome using two or more predictors. The model quantifies each predictor's contribution while holding others constant.

## 1. Model equation

For predictors X1, X2, ..., Xk:

Y = β0 + β1 X1 + β2 X2 + ... + βk Xk + ϵ

- Y: dependent variable  
- β0: intercept  
- βi: slope coefficients (expected change in Y for a one-unit increase in Xi, holding the other predictors constant)  
- ϵ: error term (variation in Y not explained by the predictors)
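
To make the equation concrete, the sketch below simulates data from a known model and recovers the coefficients by least squares. The coefficient values, sample size, and noise level are made up for illustration, and only NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two hypothetical predictors X1 and X2
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)

# Made-up true parameters: β0 = 2.0, β1 = 1.5, β2 = -0.7
beta_true = np.array([2.0, 1.5, -0.7])
eps = rng.normal(scale=0.1, size=n)  # error term ϵ
Y = beta_true[0] + beta_true[1] * X1 + beta_true[2] * X2 + eps

# Design matrix: a column of ones for the intercept, then the predictors
design = np.column_stack([np.ones(n), X1, X2])

# Ordinary least squares estimate of (β0, β1, β2)
beta_hat, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(np.round(beta_hat, 2))
```

With little noise and a few hundred observations, the estimates should land close to the values used to generate the data.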

## 2. Steps to build an MLR model

1. Define the objective (e.g., predict house price, salary).  
2. Select potential predictors: numerical, categorical, derived features.  
3. Exploratory Data Analysis (EDA): distributions, outliers, correlations, missing values, scaling.  
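4. Encode categorical variables: one-hot for nominal categories (dropping one reference level so the dummies are not perfectly collinear with the intercept), ordinal encoding for ordered categories.  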
5. Fit the model.

Example (Python):
```python
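import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("data.csv")

# Predictors; education_level is assumed to be numerically encoded already (step 4)
X = df[['age', 'income', 'education_level']]
X = sm.add_constant(X)  # adds the intercept column (β0), named "const"
y = df['salary']

# Fit ordinary least squares and print the coefficient table
model = sm.OLS(y, X).fit()
print(model.summary())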
```
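
The example above passes `education_level` straight into the model, which only works if the column is already numeric. A minimal sketch of the encoding in step 4 follows. The `city` column, the category labels, and the ordering `HS < BSc < MSc` are hypothetical, chosen only to show one nominal and one ordered variable.

```python
import pandas as pd

# Hypothetical data; column names and category labels are illustrative only
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "city": ["Paris", "Lyon", "Paris", "Nice"],      # nominal
    "education_level": ["HS", "BSc", "MSc", "BSc"],  # ordered
})

# Ordinal encoding: map ordered categories to integers that respect the order
education_order = {"HS": 0, "BSc": 1, "MSc": 2}
df["education_level"] = df["education_level"].map(education_order)

# One-hot encoding for the nominal column; drop_first=True drops one
# reference level so the dummies are not collinear with the intercept
encoded = pd.get_dummies(df, columns=["city"], drop_first=True, dtype=float)
print(encoded.columns.tolist())
```

Integer codes treat the steps between levels as equally spaced. If that assumption is doubtful, one-hot encoding the ordered variable is a safer choice.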

## 3. Model evaluation

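- R²: proportion of the variance in Y explained by the predictors; it never decreases when a predictor is added  
- Adjusted R²: R² penalized for the number of predictors, so it can fall when an uninformative predictor is added  
- p-values: test whether each coefficient differs from zero, given the other predictors in the model  
- F-statistic: tests whether the model as a whole explains more than an intercept-only model  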
- Residual diagnostics: linearity, homoscedasticity, normality, outliers, influential points
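
The first, second, and fourth quantities above can be computed directly from the residuals. The sketch below does so on simulated data with NumPy; the sample size, number of predictors, and true coefficients are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3  # observations and predictors (made-up sizes)

X = rng.normal(size=(n, k))
# Made-up truth: the third predictor has no effect on y
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)

design = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ beta_hat

ss_res = np.sum(residuals ** 2)       # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation

r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat = ((ss_tot - ss_res) / k) / (ss_res / (n - k - 1))

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, F = {f_stat:.1f}")
```

For a model fitted with statsmodels, the same quantities are available as `model.rsquared`, `model.rsquared_adj`, and `model.fvalue`, with per-coefficient p-values in `model.pvalues`.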

## 4. Handling multiple predictors

- Feature selection: forward selection, backward elimination, stepwise selection  
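- Regularization: LASSO (L1 penalty, can shrink coefficients to exactly zero), Ridge (L2 penalty, shrinks coefficients but keeps every predictor), Elastic Net (a mix of both penalties)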
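
Of the three penalties, ridge has a simple closed form, which makes its shrinkage easy to see. The sketch below is a NumPy illustration on made-up data, assuming standardized predictors and a centered response so the intercept drops out. LASSO and Elastic Net have no closed form and are normally fitted with a library solver.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100

# Two nearly collinear predictors plus one independent predictor (made-up data)
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 3.0 * x1 + 1.0 * x3 + rng.normal(size=n)

# Standardize predictors and center y so the intercept drops out
# and the penalty treats every coefficient on the same scale
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
yc = y - y.mean()

def ridge(Xs, yc, lam):
    """Closed-form ridge estimate: (X'X + lam * I)^-1 X'y."""
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yc)

for lam in (0.0, 1.0, 10.0, 100.0):  # lam = 0 is ordinary least squares
    print(lam, np.round(ridge(Xs, yc, lam), 2))
```

As `lam` grows, the overall size of the coefficient vector shrinks. In practice the penalty strength is chosen by cross-validation rather than fixed by hand.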

## 5. Multicollinearity

Highly correlated predictors inflate the standard errors of the coefficients, which makes the estimates unstable and hard to interpret. To detect this, use:
- Correlation heatmap  
- Variance Inflation Factor (VIF)

Example:
```python
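from statsmodels.stats.outliers_influence import variance_inflation_factor

# X is the design matrix with its "const" column; the intercept's own VIF is not meaningful, so skip it
vif = {col: variance_inflation_factor(X.values, i) for i, col in enumerate(X.columns) if col != "const"}
# Rule of thumb: VIF > 10 indicates serious multicollinearity (some texts use 5)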
```
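
The VIF of predictor j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing that predictor on all the others. The sketch below implements that definition in NumPy on made-up data. The helper name `vif` and the three simulated columns are illustrative and not part of any library.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (predictors only, no constant column)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(3)
a = rng.normal(size=300)
b = a + rng.normal(scale=0.1, size=300)  # nearly a copy of a
c = rng.normal(size=300)                 # unrelated to a and b
values = vif(np.column_stack([a, b, c]))
print(np.round(values, 1))
```

The two near-duplicate columns should show large VIFs, while the unrelated column stays close to 1.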

## 6. Model refinement

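- Remove predictors that are not significant or are highly correlated with others, refitting after each change  
- Add polynomial or interaction terms if residual plots show curvature or an effect that depends on another predictor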
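
As a sketch of the second point, the example below simulates a response that depends on an interaction and a squared term, then compares a main-effects fit with an extended one. The data, coefficients, and the helper `r_squared` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Made-up truth: the effect of x1 depends on x2, and y curves in x2
y = 1.0 + 0.5 * x1 + 0.5 * x2 + 2.0 * x1 * x2 + 1.0 * x2 ** 2 + rng.normal(size=n)

def r_squared(design, y):
    """R² of a least-squares fit of y on the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

ones = np.ones(n)
main_effects = np.column_stack([ones, x1, x2])
# Add the interaction x1 * x2 and the squared term x2 ** 2
extended = np.column_stack([ones, x1, x2, x1 * x2, x2 ** 2])

print(f"main effects only: R^2 = {r_squared(main_effects, y):.3f}")
print(f"with extra terms:  R^2 = {r_squared(extended, y):.3f}")
```

Because R² never decreases when terms are added, adjusted R² or performance on held-out data is a fairer basis for deciding whether the extra terms are worth keeping.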

## 7. Final interpretation

Each coefficient gives the expected change in Y for a one-unit change in its predictor, holding the other variables constant.  
Example: β2 = 3.5 for income → “Predicted salary increases by 3.5 units, on average, for every 1-unit increase in income, holding other variables constant.”
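
The interpretation can be checked numerically: two predictions that differ only by one unit of income differ by the income coefficient. In the sketch below, 3.5 matches the example above; the intercept and the age coefficient are hypothetical values added only to make the equation complete.

```python
# Hypothetical fitted coefficients (illustrative values, not from real data)
coef = {"const": 20.0, "age": 0.8, "income": 3.5}

def predict(age, income):
    """Predicted salary from the fitted linear equation."""
    return coef["const"] + coef["age"] * age + coef["income"] * income

# Same age, income differs by exactly one unit
low = predict(age=40, income=10)
high = predict(age=40, income=11)

# The gap between the two predictions is the income coefficient
print(high - low)
```

The same difference appears at any age and any starting income, which is what "holding other variables constant" means in a model without interaction terms.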