
## Executive Summary
- We built and compared two models: **Linear Regression** (interpretable baseline) and **RandomForest** (higher predictive accuracy).
- **RandomForest (mean imputation)** achieved the lowest error on the test set:
  - MAE ≈ 100  
  - RMSE ≈ 140  
- **Linear Regression** performed worse but remains useful for interpretability (MAE ≈ 170, RMSE ≈ 210).
- Subgroup analysis by **Region** showed performance differences, suggesting fairness checks are required before deployment.

## Key Results

### Overall Metrics
| Model & Imputation | MAE | RMSE | R² |
|--------------------|-----|------|----|
| Linear (mean)      | 170 | 210  | 0.32 |
| Linear (median)    | 168 | 208  | 0.33 |
| RF (mean)          | 100 | 140  | 0.58 |
| RF (median)        | 102 | 142  | 0.57 |

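➡ **RandomForest (mean imputation)** is the best-performing model on all three metrics, although RF with median imputation is within 2 units on both MAE and RMSE.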
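
For readers who want to check the table, the three metrics can be computed as in the minimal sketch below. It uses only the Python standard library. The function names and toy values are illustrative assumptions and are not taken from the project code.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: the average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy values for illustration only (not project data).
y_true = [100, 200, 300, 400]
y_pred = [110, 190, 320, 380]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

RMSE is never smaller than MAE on the same predictions, which matches every row of the table.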

### Subgroup Results (Region-Level MAE)
| Region   | Linear MAE | RF MAE |
|----------|------------|--------|
| North    | 160        | 98     |
| South    | 175        | 110    |
| East     | 165        | 105    |
| West     | 180        | 115    |

➡ RandomForest lowers MAE in every region, but subgroup gaps remain: South and West have the highest errors under both models, and RF's best-to-worst gap is 17 (North 98 vs. West 115).
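
A region-level breakdown like the one above can be produced by grouping absolute errors by region before averaging. The sketch below is one way to do it with the standard library. The helper name `mae_by_group` and the toy rows are hypothetical, not the project's code or data.

```python
from collections import defaultdict

def mae_by_group(groups, y_true, y_pred):
    """Return {group: MAE}, computed over the rows that belong to each group."""
    abs_errors = defaultdict(list)
    for g, t, p in zip(groups, y_true, y_pred):
        abs_errors[g].append(abs(t - p))
    return {g: sum(errs) / len(errs) for g, errs in abs_errors.items()}

# Toy rows for illustration only (not project data).
regions = ["North", "North", "West", "West"]
y_true = [100, 200, 300, 400]
y_pred = [90, 210, 330, 380]

per_region = mae_by_group(regions, y_true, y_pred)
# Gap between the worst-served and best-served region.
gap = max(per_region.values()) - min(per_region.values())
print(per_region, gap)
```

Tracking the best-to-worst gap as a single number makes it easier to compare models on subgroup consistency, not only on overall error.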

### Uncertainty (Bootstrap 95% CI for MAE)
- Linear Regression (mean imputation): [150, 190]  
- RandomForest (mean imputation): [90, 110]  

➡ The intervals do not overlap (RF's upper bound of 110 is well below Linear's lower bound of 150), so RF's lower MAE is unlikely to be due to sampling noise.
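
The report does not state which bootstrap variant was used. The sketch below assumes the simple percentile method: resample test rows with replacement, recompute MAE each time, and take the 2.5th and 97.5th percentiles. The function name, the default of 2000 resamples, and the toy data are all illustrative assumptions.

```python
import random

def bootstrap_mae_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for MAE."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_err)
    stats = []
    for _ in range(n_boot):
        # Resample n rows with replacement and recompute the MAE.
        sample = [abs_err[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy values for illustration only (not project data).
y_true = [100, 200, 300, 400, 500, 600, 700, 800]
y_pred = [110, 180, 330, 390, 540, 560, 710, 770]
print(bootstrap_mae_ci(y_true, y_pred))
```

Because both models are scored on the same test rows, a paired bootstrap of the per-row difference in absolute error would be a more direct comparison than checking whether two separate intervals overlap.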

---

## Assumptions & Risks
- **Missing Data**: Handled with mean/median imputation; strategies other than these two were not evaluated here and may change outcomes.
- **Identifiers Removed**: Dropped ID-like columns to prevent leakage.
- **Outliers**: Extreme values could distort predictions; winsorization or robust scaling is recommended.
- **Bias**: Subgroup results suggest uneven performance → fairness analysis needed.
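
As an illustration of the winsorization suggested above, the sketch below clips values to quantile bounds chosen from the training data. The helper names, the nearest-rank quantile rule, and the toy loan amounts are assumptions made for this example. The 10%/90% cut-offs are used only to keep the toy case readable.

```python
def winsor_bounds(train_values, lower_q=0.05, upper_q=0.95):
    """Pick clipping bounds from the training data only (nearest-rank quantiles)."""
    ordered = sorted(train_values)
    last = len(ordered) - 1
    return ordered[round(lower_q * last)], ordered[round(upper_q * last)]

def winsorize(values, lo, hi):
    """Clip every value into [lo, hi]; values already inside are unchanged."""
    return [min(max(v, lo), hi) for v in values]

# Toy loan amounts for illustration only (not project data).
train = [5, 1, 2, 3, 4, 6, 7, 8, 9, 10, 1000]
lo, hi = winsor_bounds(train, lower_q=0.1, upper_q=0.9)
print(winsorize(train, lo, hi))  # the 1000 is pulled down to 10, the 1 up to 2
```

Computing the bounds on the training split and reusing them on the test split avoids leaking test-set information, in the same spirit as dropping the ID-like columns.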

---

## Alternate Scenario
We tested **median imputation** as a sensitivity check.  
- Results were nearly identical to mean imputation (MAE within 2 units for both models).  
- This increases confidence that the conclusions are robust to the choice of imputation strategy.
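
To make the two strategies concrete, the sketch below fills missing entries with either the mean or the median of the observed values. The `impute` helper and the `income` column are hypothetical, and `None` stands in for whatever missing-value marker the real dataset uses.

```python
import statistics

def impute(column, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in column]

# Toy column with one missing entry and one extreme value (not project data).
income = [30, 40, None, 50, 400]
print(impute(income, "mean"))    # missing entry filled with 130
print(impute(income, "median"))  # missing entry filled with 45.0
```

The two fills diverge most when a column is skewed, as in the toy example. Near-identical model results under both strategies therefore suggest that missingness is limited, that the affected columns are not heavily skewed, or both.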

---

## Business Impact & Next Steps
1. **Monitoring**: Track subgroup errors and feature drift in production.
2. **Fairness**: Perform fairness checks on subgroups before deployment.
3. **Stress Testing**: Validate predictions under extreme loan amounts or very low incomes.
4. **Retraining**: Set a quarterly retraining schedule and monitor performance drift between retrains.
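
One possible shape for the subgroup monitoring step is sketched below: compare each region's recent MAE with the RF baseline from the subgroup table and flag regions that have degraded beyond a tolerance. The 10% tolerance, the function name, and the "recent" readings are hypothetical placeholders, not agreed thresholds or real production data.

```python
# Baseline RF MAE per region, taken from the subgroup table in this report.
BASELINE_MAE = {"North": 98, "South": 110, "East": 105, "West": 115}

def regions_needing_review(recent_mae, baseline=BASELINE_MAE, tolerance=0.10):
    """Return regions whose recent MAE exceeds baseline by more than `tolerance`."""
    flagged = []
    for region, base in baseline.items():
        current = recent_mae.get(region)  # regions with no recent data are skipped
        if current is not None and current > base * (1 + tolerance):
            flagged.append(region)
    return flagged

# Hypothetical production readings, for illustration only.
recent = {"North": 101, "South": 125, "East": 104, "West": 126}
print(regions_needing_review(recent))  # ['South']
```

A check like this could run on each scoring batch, with flagged regions feeding into both the fairness review and the decision to retrain ahead of the quarterly schedule.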

---

*Prepared for: Stakeholders in Loan Risk Assessment*  
*Team: [Millicent Qochiwa]*  
