# **5. Experiments and Results**

# 5.1.1 Return Model 1 Experiments

# 5.1.2 Return ANN Model Experiments

# 5.1.3 Default XGBoost Experiments

### Overfitting Assessment

To evaluate model stability, ROC AUC scores were compared across training, test, and cross-validation sets:

| Model                     | Train AUC | Test AUC | 3-Fold CV AUC (Train) |
|----------------------------|-----------|----------|-----------------------|
| Baseline XGBoost           | 0.7285    | 0.7278   | 0.7262 ± 0.0010        |
| Tuned XGBoost (final)      | 0.7444    | 0.7317   | 0.7295 ± 0.0011        |

Both models showed limited overfitting: the train-test AUC gap is 0.0007 for the baseline and 0.0127 for the tuned model, and the cross-validation scores sit within about 0.002 of the test scores.
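The gaps behind this assessment can be recomputed from the table. The snippet below is a minimal sketch that uses only the AUC values reported above; the dictionary layout and variable names are illustrative and not taken from the project code.

```python
# AUC values copied from the overfitting table above.
reported_auc = {
    "Baseline XGBoost": {"train": 0.7285, "test": 0.7278, "cv_mean": 0.7262},
    "Tuned XGBoost (final)": {"train": 0.7444, "test": 0.7317, "cv_mean": 0.7295},
}

# Train-test gap: a small positive value suggests limited overfitting.
train_test_gap = {
    model: round(scores["train"] - scores["test"], 4)
    for model, scores in reported_auc.items()
}

# Test-CV gap: a small value suggests the test score is not a lucky split.
test_cv_gap = {
    model: round(scores["test"] - scores["cv_mean"], 4)
    for model, scores in reported_auc.items()
}

for model in reported_auc:
    print(
        f"{model}: train-test gap = {train_test_gap[model]:.4f}, "
        f"test-CV gap = {test_cv_gap[model]:.4f}"
    )
```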


### Learning Curve Analysis

The learning curve shows the evolution of model performance as the training set size increases:

| Metric            | Trend                                                    | Insight                                         |
|-------------------|-----------------------------------------------------------|-------------------------------------------------|
| Training AUC      | Started high (~0.84), gradually declined to ~0.75          | Indicates initial overfitting, stabilising with more data |
| Validation AUC    | Steadily increased from ~0.718 to ~0.729                  | Shows improved generalisation over time        |
| Gap Between Curves| Narrowed as training progressed                           | Moderate, manageable overfitting               |

Overall, the model generalises effectively. Validation AUC improves by only about 0.011 (from ~0.718 to ~0.729) across the full range of training set sizes, so additional data offers diminishing returns.

### Feature Importance Analysis

Using gain-based feature importance scores from the tuned XGBoost model, the top predictors of loan default were identified based on their contribution to model performance:

| Rank | Feature                    | Key Insight                                                                                   |
|-----:|-----------------------------|----------------------------------------------------------------------------------------------|
| 1    | `int_rate`                  | Higher interest rates strongly correlate with higher credit risk and default probability.   |
| 2    | `grade_B`                   | Credit grade linked to borrower risk profiling and expected repayment behavior.             |
| 3    | `term_60 months`             | Longer loan terms increase risk due to extended borrower exposure over time.                 |
| 4    | `home_ownership_MORTGAGE`    | Mortgage status may signal higher financial obligations compared to outright ownership.      |
| 5    | `home_ownership_RENT`        | Renters may have less financial stability, increasing default risk.                         |
| 6    | `emp_length_Unknown`         | Missing employment length data may suggest weaker credit profiles.                          |
| 7    | `grade_C`                    | Lower credit grades (C, D, E) are associated with higher default probability.                |
| 8    | `grade_E`                    | Similar to grade_C, indicating higher credit risk borrowers.                                |
| 9    | `fico_range_high`            | Lower FICO ranges are tied to higher default risk.                                           |
| 10   | `grade_D`                    | Further reinforces the risk link between lower grades and default likelihood.               |
| 11   | `dti`                        | Higher debt-to-income ratios indicate greater repayment burden and risk.                    |
| 12   | `loan_amnt`                  | Larger loan amounts correlate with increased financial strain and default likelihood.       |
| 13   | `verification_status_Verified` | Verified income status affects borrower credibility and repayment assessment.             |
| 14   | `purpose_small_business`     | Loans for small businesses carry elevated risk due to business volatility.                  |
| 15   | `installment`                | Monthly repayment amounts impact borrower affordability and repayment behavior.             |

While features with lower gain-based importance contributed less to predictive performance in this model, they remain relevant indicators of borrower behavior and credit risk in real-world lending contexts.
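A ranking of this kind can be derived from per-feature gain scores. The sketch below assumes the scores are already available as a feature-to-gain dictionary, which is the shape returned by XGBoost's `Booster.get_score(importance_type="gain")`. The helper name `rank_by_gain` and the toy values are hypothetical and are not the gains from this model.

```python
from typing import Dict, List, Tuple


def rank_by_gain(
    gain_scores: Dict[str, float], top_k: int = 15
) -> List[Tuple[int, str, float]]:
    """Return (rank, feature, share of total gain) for the top_k features."""
    total = sum(gain_scores.values())
    if total == 0:
        return []
    # Sort features from highest to lowest gain.
    ordered = sorted(gain_scores.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, feature, gain / total)
        for rank, (feature, gain) in enumerate(ordered[:top_k], start=1)
    ]


# Toy gains for illustration only (hypothetical, not results from this study).
toy_gains = {"feature_a": 30.0, "feature_b": 50.0, "feature_c": 20.0}
for rank, feature, share in rank_by_gain(toy_gains, top_k=2):
    print(f"{rank}. {feature}: {share:.1%} of total gain")
```

With a fitted `XGBClassifier`, the input dictionary would typically come from `model.get_booster().get_score(importance_type="gain")`.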

# 5.1.4 Default ANN Model Experiments

# 5.2.1 Detailed Comparative Analysis of Return Models

# 5.2.2 Detailed Comparative Analysis of Default Models 

### Table of Model Evaluation Metric Comparisons

| Model                   | Accuracy | Precision (Class 1) | Recall (Class 1) | F1 Score (Class 1) | ROC AUC |
|--------------------------|----------|---------------------|------------------|--------------------|---------|
| ANN (Thresh 0.45)         | 0.6680   | 0.3660              | 0.7760           | 0.4980             | 0.7750  |
| XGB (Thresh 0.30)         | 0.6607   | 0.2626              | 0.9256           | 0.4091             | 0.7317  |
| XGB (Thresh 0.50)         | 0.6607   | 0.3463              | 0.6815           | 0.4592             | 0.7317  |
| XGB (Thresh 0.70)         | 0.6607   | 0.4886              | 0.2862           | 0.3610             | 0.7317  |
| RF (Baseline)             | 0.8030   | 0.5800              | 0.1700           | 0.2600             | 0.7760  |
| RF (Class Weighted)       | 0.8020   | 0.5900              | 0.1500           | 0.2400             | 0.7790  |
| RF (Thresh 0.2)           | 0.8020   | 0.3400              | 0.8100           | 0.4800             | 0.7790  |
| RF Weighted + Thresh 0.2  | 0.8020   | 0.3500              | 0.7900           | 0.4900             | 0.7790  |


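The models offer complementary strengths: the ANN combines high recall (0.776) with the highest F1 score (0.498) and a strong AUC (0.775), XGBoost reaches the highest recall (0.926) when its threshold is lowered to 0.30, and Random Forest with a 0.2 threshold achieves a comparable F1 (0.48 to 0.49) together with the highest AUC (0.779). To best meet the coursework goal of minimising defaults, XGBoost at a 0.30 threshold is recommended as the primary model, because it flags the largest share of eventual defaulters.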
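The threshold-dependent metrics in the comparison can be reproduced with a few lines of code. The following is a minimal sketch, assuming class 1 denotes a default and that a loan is flagged when its predicted probability is at or above the threshold. The function name `metrics_at_threshold` and the toy labels and probabilities are hypothetical and are not project data.

```python
from typing import Dict, Sequence


def metrics_at_threshold(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Flag a loan as default (1) when y_prob >= threshold, then score it."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels and predicted default probabilities (hypothetical values).
toy_true = [1, 0, 1, 0, 0, 1, 0, 0]
toy_prob = [0.90, 0.40, 0.35, 0.20, 0.60, 0.80, 0.10, 0.32]

for threshold in (0.30, 0.50, 0.70):
    scores = metrics_at_threshold(toy_true, toy_prob, threshold)
    summary = ", ".join(f"{name}={value:.3f}" for name, value in scores.items())
    print(f"threshold {threshold:.2f}: {summary}")
```

Lowering the threshold flags more loans as likely defaults, which raises recall at the expense of precision; accuracy generally shifts as well, since the predicted labels change.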

| Goal                          | Best Model                  | Reasoning                                            |
|------------------------------- |----------------------------- |----------------------------------------------------- |
| Maximizing Recall              | XGBoost @ Threshold 0.30     | Catching nearly all positives (recall = 0.93)        |
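| Maximizing Precision           | XGBoost @ Threshold 0.70     | Highest precision among the XGBoost thresholds (0.49); RF without threshold adjustment reaches 0.58 to 0.59, but with recall of only 0.15 to 0.17 |
| Balanced F1 (Fair Trade-off)    | ANN @ Threshold 0.45 or RF Weighted + Threshold 0.2 | Highest F1 scores (0.498 and 0.490), strong recall and decent precision |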
| Overall AUC Performance        | ANN or RF (any tuned)         | AUC (0.775-0.779), best class discrimination         |


# 5.3.1 High-Return Loan Portfolio Analysis: _________ 

# 5.3.2 Low-Risk Loan Portfolio Analysis: XGBoost 

### Loan Portfolio Segmentation Using XGBoost

Three strategies for loan portfolio construction were explored: a fixed threshold, a fixed number of loans (Top N), and a fixed share of loans (Top X%). The following table summarises the key portfolio options:

| Strategy                        | Total Loans Selected | Observed Default Rate | Key Insight                                                 |
|----------------------------------|----------------------|-----------------------|-------------------------------------------------------------|
| Threshold-Based (< 0.2)          | 19,277               | 3.94%                 | Very low-risk portfolio ideal for minimising defaults.      |
| Top N Loans (Top 300 safest)     | 300                  | 1.33%                 | Ultra-low default rate but limited size and diversification.|
| Top X% Loans (Top 10%)           | 15,134               | 3.36%                 | Balances low risk with larger portfolio scale.              |
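The three selection rules can be sketched as follows. This is an illustrative outline only: it assumes each loan carries a predicted default probability and an observed default flag, and that "safest" means lowest predicted probability. The type alias, the function names, and the toy loans are hypothetical and do not come from the project code or data.

```python
from typing import List, Tuple

# (predicted default probability, observed default flag: 1 = defaulted)
Loan = Tuple[float, int]


def select_below_threshold(loans: List[Loan], threshold: float = 0.2) -> List[Loan]:
    """Keep loans whose predicted default probability is below the threshold."""
    return [loan for loan in loans if loan[0] < threshold]


def select_top_n(loans: List[Loan], n: int) -> List[Loan]:
    """Keep the n loans with the lowest predicted default probability."""
    return sorted(loans, key=lambda loan: loan[0])[:n]


def select_top_fraction(loans: List[Loan], fraction: float) -> List[Loan]:
    """Keep the safest share of loans, e.g. fraction=0.10 for the top 10%."""
    n = int(len(loans) * fraction)
    return select_top_n(loans, n)


def observed_default_rate(portfolio: List[Loan]) -> float:
    """Share of selected loans that actually defaulted."""
    if not portfolio:
        return float("nan")
    return sum(flag for _, flag in portfolio) / len(portfolio)


# Toy loans for illustration only (hypothetical values).
toy_loans: List[Loan] = [
    (0.05, 0), (0.10, 0), (0.15, 1), (0.18, 0), (0.25, 0),
    (0.40, 1), (0.55, 1), (0.70, 0), (0.85, 1), (0.95, 1),
]

portfolios = {
    "Threshold < 0.2": select_below_threshold(toy_loans, 0.2),
    "Top 3 safest": select_top_n(toy_loans, 3),
    "Top 20%": select_top_fraction(toy_loans, 0.20),
}
for name, portfolio in portfolios.items():
    rate = observed_default_rate(portfolio)
    print(f"{name}: {len(portfolio)} loans, observed default rate {rate:.2%}")
```

The threshold rule lets portfolio size vary with the score distribution, whereas the Top N and Top X% rules fix the size and let the implied cut-off vary.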

### Recommendations
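- For clients prioritising ultra-low risk, the Top 300 safest portfolio is ideal (1.33% observed default rate), at the cost of limited size and diversification.
- For those balancing risk and scale, the Threshold < 0.2 portfolio provides the largest selection (19,277 loans) while keeping the default rate low (3.94%).
- The Top 10% portfolio presents a strong middle ground (15,134 loans, 3.36% default rate), offering both diversification and a significantly reduced default risk.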

# **6. Limitations**

While the current analysis provides strong baseline results, limitations such as moderate overfitting, diminishing returns from additional data, and model interpretability constraints highlight areas for further improvement. Future work could focus on advanced feature engineering, hyperparameter optimization, or exploring alternative models (e.g., LightGBM, CatBoost, or interpretable models like Explainable Boosting Machines) to enhance predictive performance and transparency.

# **7. Conclusion** 

This coursework successfully developed two investment strategies using the Lending Club dataset: one focused on maximising returns and the other on minimising default risk.

For the return-maximisation strategy, a custom definition of return was constructed based on realised cash flows relative to the initial loan amount, ensuring a realistic and transparent measure of investment performance tailored to the lending context. Tree-based models and artificial neural networks (ANNs) were trained to predict high-return loans, with extensive data preparation steps including one-hot encoding of categorical variables and feature scaling to enhance ANN stability and learning performance. The target variable was binarized based on profitability, creating a clear prediction task aligned with investment goals. Hyperparameter tuning was conducted using Keras Tuner with Random Search, exploring different neuron counts to optimise model architecture. The final ANN model, selected based on minimising validation mean squared error, effectively captured complex patterns in the data and supported portfolio selection focused on maximising profitability while managing exposure to concentrated risks.

For the default-minimisation strategy, predictive models (specifically XGBoost and ANN) were employed to identify low-risk borrowers. Threshold adjustment techniques were applied to refine borrower segmentation, allowing for the creation of portfolios with significantly reduced observed default rates. Through the use of threshold tuning, feature importance analysis, and portfolio simulation, the approach combined technical model performance with practical, investor-oriented decision-making.

Across both strategies, model performance was carefully assessed using a range of evaluation metrics including AUC, F1 score, precision, and recall, while learning curve and overfitting assessments confirmed model stability and generalisability. The analysis further ensured that selected portfolios not only performed well on training data but also maintained robustness when evaluated on independent test sets.

Ultimately, the project demonstrated a structured, end-to-end application of predictive modelling to real-world investment decision-making, aligning technical modelling practices with practical investment objectives. By integrating machine learning techniques with financial reasoning, the project highlights how data-driven approaches can effectively support strategic loan investment decisions in peer-to-peer lending markets.