### Final Credit Risk Assessment

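With the PD (probability of default), LGD (loss given default) and EAD (exposure at default) models developed, we can compute the expected loss (EL) for every loan in the dataset as EL = PD × LGD × EAD, where the EAD model's output is a fraction of the funded amount.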

#### Packages

In [1]:
import os
import dill
import numpy as np
import pandas as pd

#### Importing data

In [2]:
df = pd.read_csv("../data/loan_data.csv")

#### Importing model binaries

In [3]:
ARTIFACTS_PATH = "../models/artifacts/"
artifacts = {}

# Load every serialized artifact, keyed by its file name without the extension
for artifact in os.listdir(ARTIFACTS_PATH):
    artifact_name = os.path.splitext(artifact)[0]
    with open(os.path.join(ARTIFACTS_PATH, artifact), "rb") as file:
        artifacts[artifact_name] = dill.load(file)

print(artifacts.keys())

dict_keys(['ead_model', 'pd_preprocessing', 'pd_model', 'lgd_preprocessing', 'ead_preprocessing', 'cleaner', 'lgd_model'])
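The loading loop above can be tried without the real model files. The sketch below is a self-contained stand-in: it uses the standard-library `pickle` in place of `dill`, writes two placeholder objects to a temporary directory, and loads them back keyed by file stem. The `.pkl` extension, the placeholder objects and the `load_artifacts` helper are illustrative assumptions, not the project's actual artifacts.

```python
import os
import pickle
import tempfile


def load_artifacts(path):
    """Load every file in `path` into a dict keyed by file name without extension."""
    loaded = {}
    for file_name in os.listdir(path):
        full_path = os.path.join(path, file_name)
        if not os.path.isfile(full_path):
            continue  # skip sub-directories
        stem, _ = os.path.splitext(file_name)
        with open(full_path, "rb") as file:
            loaded[stem] = pickle.load(file)
    return loaded


# Hypothetical placeholder objects standing in for fitted models.
placeholders = {
    "pd_model": {"kind": "classifier"},
    "lgd_model": {"kind": "regressor"},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    for name, obj in placeholders.items():
        with open(os.path.join(tmp_dir, name + ".pkl"), "wb") as file:
            pickle.dump(obj, file)
    demo_artifacts = load_artifacts(tmp_dir)

print(sorted(demo_artifacts))
```

Both `pickle` and `dill` can execute arbitrary code while deserializing, so artifacts should only be loaded from a trusted location.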


#### General data cleaning

In [4]:
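# Where the earliest credit line is missing, fall back to the loan issue date
df["earliest_cr_line"] = df["earliest_cr_line"].fillna(df["issue_d"])
# Apply the cleaning transformer that was saved with the models
df = artifacts["cleaner"].transform(df)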

#### PD Model

In [5]:
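# Column 1 of predict_proba is the probability of class index 1,
# assumed here to be the default class
pd_features = artifacts["pd_preprocessing"].transform(df)
df["PD"] = artifacts["pd_model"].predict_proba(pd_features)[:, 1]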

#### LGD Model

In [6]:
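# The model output is treated as a recovery rate: clip it to [0, 1],
# then convert it to a loss rate with LGD = 1 - recovery rate
df["LGD"] = 1.0 - np.clip(
    artifacts["lgd_model"].predict(
        artifacts["lgd_preprocessing"].transform(df).astype(float)
    ), 0, 1
)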
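To make the conversion concrete, here is a pure-Python sketch of the same arithmetic. It assumes, as the cell above implies, that the LGD model outputs a recovery rate. The `clip` helper mimics `np.clip` for scalars, and the sample predictions are made up.

```python
def clip(value, lower, upper):
    """Scalar equivalent of np.clip: bound `value` to [lower, upper]."""
    return max(lower, min(value, upper))


# Made-up raw regression outputs; a regression model can stray outside [0, 1].
raw_recovery_rates = [-0.05, 0.25, 1.10]

# LGD = 1 - recovery rate, after bounding the recovery rate to [0, 1]
lgd = [1.0 - clip(rate, 0.0, 1.0) for rate in raw_recovery_rates]
print(lgd)  # [1.0, 0.75, 0.0]
```

Clipping before the subtraction keeps every LGD inside [0, 1], so out-of-range predictions cannot produce negative losses or losses above the exposure.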

#### EAD Model

In [7]:
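# The EAD prediction is clipped to [0, 1], i.e. kept as a fraction of the
# funded amount; it is scaled by funded_amnt when EL is computed
df["EAD"] = np.clip(artifacts["ead_model"].predict(
    artifacts["ead_preprocessing"].transform(df).astype(float)
), 0, 1)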

#### Estimating Expected Loss

In [8]:
# Per-loan expected loss in dollars: PD x LGD x (EAD fraction x funded amount)
df["EL"] = df["PD"] * df["LGD"] * df["EAD"] * df["funded_amnt"]
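As a worked example of this formula, the sketch below computes EL for three hypothetical loans, along with the portfolio-level ratio of EL to funded amount. All figures are invented for illustration and are unrelated to the dataset.

```python
# Hypothetical loans: (PD, LGD, EAD as a fraction of funded amount, funded amount)
loans = [
    (0.10, 0.75, 0.80, 10_000),
    (0.02, 0.50, 0.50, 20_000),
    (0.25, 1.00, 1.00, 4_000),
]

# Per-loan expected loss in currency units
expected_losses = [pd_ * lgd * ead * funded for pd_, lgd, ead, funded in loans]

total_el = sum(expected_losses)
total_funded = sum(funded for *_, funded in loans)

print(f"Total EL: ${total_el:,.2f}")
print(f"EL / funded amount: {100 * total_el / total_funded:.2f}%")
```

The three loans contribute roughly 600, 100 and 1,000, so the toy portfolio's EL is about 1,700 on 34,000 funded, or 5.00%.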

In [9]:
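# Portfolio totals, expressed in billions of dollars
expected_loss = df["EL"].sum() / 1e9
total_funded_amnt = df["funded_amnt"].sum() / 1e9
print(f"Total EL: ${expected_loss:.1f} bn")
# EL as a percentage of the total funded amount
print(f"{100 * expected_loss / total_funded_amnt:.2f}%")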

Total EL: $2.2 bn
32.59%
