---
title: "Loan Eligibility Prediction - Results"
author: "Justin, Ali, Tanav, Gurleen"
jupyter: loan-analysis
execute:
  echo: false
  warning: false
editor: source
---

In [None]:
# Imports

import pandas as pd
import pickle
from IPython.display import Markdown, display
from tabulate import tabulate

# Results & Discussion

In [None]:
# Classification Report
classification_report = pd.read_csv("../results/tables/classification_report.csv")
classification_report.rename(columns={'Unnamed: 0': 'metric'}, inplace=True)

# Cross-validation results
cross_validation_results = pd.read_csv("../results/tables/cross_validation_results.csv")

# Test-set scores
test_scores = pd.read_csv("../results/tables/test_scores.csv")

# Confusion matrix counts
confusion_matrix_csv = pd.read_csv("../results/tables/confusion_matrix.csv")

## Model Testing

## Accuracy Score

## Confusion Matrix

![Confusion Matrix](../results/figures/confusion_matrix.png){#fig-confusion_matrix_png width=100%}

Based on the confusion matrix, we can derive several key performance metrics for our loan eligibility prediction model. The model correctly predicted `{python} confusion_matrix_csv.iloc[0, 1]` loan denials and `{python} confusion_matrix_csv.iloc[1, 2]` loan approvals. It misclassified `{python} confusion_matrix_csv.iloc[0, 2]` actual denials as approvals (false positives) and `{python} confusion_matrix_csv.iloc[1, 1]` actual approvals as denials (false negatives). From this, we can see that the model is better at predicting approvals than denials. Depending on the business context, we may prefer to minimize false negatives to avoid losing potential customers who would have been approved for loans. However, we also need to weigh the cost of false positives, since each one is a bad loan that gets approved.
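As an illustration of how these metrics follow from the four cells of a confusion matrix, here is a minimal sketch. It assumes approval is the positive class, as in the paragraph above. The function name `confusion_metrics` and the counts are hypothetical and are not taken from this project's results.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Derive summary metrics from the four cells of a binary confusion matrix.

    Approval is treated as the positive class:
      tn = denials predicted as denials, fp = denials predicted as approvals,
      fn = approvals predicted as denials, tp = approvals predicted as approvals.
    """
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision_approved": tp / (tp + fp),
        "recall_approved": tp / (tp + fn),
        "precision_denied": tn / (tn + fn),
        "recall_denied": tn / (tn + fp),
    }


# Hypothetical counts for illustration only; not the project's results.
metrics = confusion_metrics(tn=20, fp=20, fn=5, tp=55)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, recall for approvals (about 0.92) is much higher than recall for denials (0.50), which is the kind of asymmetry described above.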

## Classification Report

In [None]:
Markdown(classification_report.to_markdown())

From the classification report, we can see that the precision for the denied (false) class is `{python} round(classification_report.iloc[0, 1], 2)`, meaning that when the model predicts a loan will be denied, it is correct `{python} int(round(classification_report.iloc[0, 1] * 100))`% of the time. The recall for this class is 0.47, meaning that the model correctly identifies 47% of all actual loan denials. The precision for the approved (true) class is 0.80, meaning that when the model predicts a loan will be approved, it is correct 80% of the time. The recall for this class is 0.95, meaning that the model correctly identifies 95% of all actual loan approvals. From the weighted average column, the recall, which equals the overall accuracy, is 0.80, meaning that the model correctly predicts loan eligibility 80% of the time.
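Accuracy can be read from the weighted-average row because the support-weighted average of the per-class recalls is, by definition, the overall accuracy. The sketch below checks this on toy labels. The helper `per_class_recall` and the label lists are hypothetical and unrelated to the project's data.

```python
def per_class_recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    predictions_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_class) / len(predictions_for_class)


# Toy labels for illustration only (True = approved, False = denied).
y_true = [True, True, True, True, True, True, False, False, False, False]
y_pred = [True, True, True, True, True, False, True, True, False, False]

# Overall accuracy: share of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Weighted-average recall: each class's recall weighted by its share of the samples.
weighted_recall = sum(
    (y_true.count(label) / len(y_true)) * per_class_recall(y_true, y_pred, label)
    for label in (False, True)
)

print(f"accuracy: {accuracy:.2f}")
print(f"weighted-average recall: {weighted_recall:.2f}")
```

Both values come out to 0.70 for these toy labels, even though the per-class recalls differ (about 0.83 for approvals and 0.50 for denials).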

## Precision Recall Curve

![Precision-Recall Curve](../results/figures/precision_recall_curve.png){#fig-precision_recall width=100%}

From the precision-recall curve, we can see that precision remains high as recall increases. The average precision (AP) score is 0.87, which indicates that the model performs well at identifying loan approvals.
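For readers unfamiliar with the AP score, the sketch below shows one common way to compute it: the sum of the precision at each threshold, weighted by the increase in recall at that threshold. It assumes all scores are distinct and that approval is the positive class. The function name and the toy inputs are hypothetical, and this is not necessarily how the reported 0.87 was produced.

```python
def average_precision(y_true, scores):
    """AP = sum over ranks of (R_n - R_{n-1}) * P_n, assuming distinct scores."""
    # Rank samples from the highest predicted score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_positive = sum(y_true)
    true_positives = 0
    ap = 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            true_positives += 1
            precision_at_rank = true_positives / rank
            # Recall only increases at a positive sample, by 1 / n_positive.
            ap += precision_at_rank / n_positive
    return ap


# Toy labels and scores for illustration only (1 = approved, 0 = denied).
toy_labels = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
ap = average_precision(toy_labels, toy_scores)
print(f"AP: {ap:.3f}")
```

Here the three positives are reached at precisions 1, 1, and 0.75, so the AP is their mean, roughly 0.917.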

## ROC Curve

![ROC Curve](../results/figures/roc_curve.png){#fig-roc width=100%}

We can see from the ROC curve and the AUC score of 0.80 that the model has a good ability to distinguish between approved and denied loan applications. The score also suggests that there is still room to improve its predictive performance.
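One way to interpret the AUC is as the probability that a randomly chosen approved application receives a higher score than a randomly chosen denied one. The sketch below computes it directly from that pairwise definition, counting ties as half a win. The function name and toy inputs are hypothetical, and the quadratic loop is meant for illustration rather than for large datasets.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, t in zip(scores, y_true) if t]
    negatives = [s for s, t in zip(scores, y_true) if not t]
    # A pair counts 1 if the positive outscores the negative, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy labels and scores for illustration only (1 = approved, 0 = denied).
toy_labels = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
auc = roc_auc(toy_labels, toy_scores)
print(f"AUC: {auc:.3f}")
```

In this toy case, five of the six positive-negative pairs are ordered correctly, giving an AUC of about 0.833. A value of 0.5 would correspond to random ranking and 1.0 to perfect separation.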

## References