## MIS780 - Artificial Intelligence for Business

## Week 4 - Part 2: Multi-layer Perceptron for Classification

In this notebook, we predict whether credit card clients will default on their payments, using deep learning models.


## Table of Contents
   
   
1. [Preparation](#cell_Preparation)    
    
    
2. [Credit Card Client Data](#cell_Ames)


3. [Deep Learning with Sequential Model](#cell_deep)


<a id = "cell_Preparation"></a>
## 1. Preparation

Load some standard Python libraries.

In [None]:
from __future__ import print_function
import os
import math
import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Next, load the `scikit-learn` utilities for splitting, scaling, encoding, and imputing data.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.preprocessing import OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline


Some options to control Pandas display

In [None]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

<a id = "cell_Ames"></a>
## 2. Credit Card Client Data

The data set for this exercise is `default_of_credit_card_clients.csv`, which is available on Cloud Deakin. A description of the data set is available on the [Kaggle website](https://www.kaggle.com/datasets/mariosfish/default-of-credit-card-clients). The aim of this exercise is to predict the class value of the `dpnm` column (`1` for default and `0` for no default).

In [None]:
credit_data_org = pd.read_csv("default_of_credit_card_clients.csv")
# len() counts rows; .size would count rows * columns
print('Number of records read: ', len(credit_data_org))

In [None]:
credit_data_org.head(10)

Find the column types and the number of missing values in each column

In [None]:
# Finding column types
credit_data_org.dtypes

In [None]:
# Check for missing values
missing = credit_data_org.isnull().sum()
missing = missing[missing > 0]
missing.sort_values(ascending=False)

In [None]:
# Remove the ID column
credit_data_org = credit_data_org.drop('ID', axis=1)
credit_data_org.head()


Split the data into training and validation sets. The split is defined in three parts (training, validation, test), but the test part is set to zero and ignored here.

In [None]:
train_size, valid_size, test_size = (0.7, 0.3, 0.0)
credit_train, credit_valid = train_test_split(credit_data_org,
                                      test_size=valid_size,
                                      random_state=2020)
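
When one class is much rarer than the other, it can help to keep the class ratio the same in both parts of the split. The sketch below is an assumption, not part of the original notebook: it uses the `stratify` argument of `train_test_split` on a small synthetic data frame (the `toy` frame and its `feature` column are made up for illustration).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data: 80% class 0, 20% class 1
rng = np.random.default_rng(2020)
toy = pd.DataFrame({
    'feature': rng.normal(size=1000),
    'dpnm': np.array([0] * 800 + [1] * 200),
})

# stratify keeps the proportion of each class equal in both splits
toy_train, toy_valid = train_test_split(toy,
                                        test_size=0.3,
                                        stratify=toy['dpnm'],
                                        random_state=2020)

print('Default rate, train:', toy_train['dpnm'].mean())
print('Default rate, valid:', toy_valid['dpnm'].mean())
```

Without `stratify`, the two default rates can drift apart by chance, which makes validation metrics harder to compare with training metrics.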

Extract data for training and validation into x and y vectors.

In [None]:
label_col = 'dpnm'

credit_y_train = credit_train[[label_col]]
credit_x_train = credit_train.drop(label_col, axis=1)
credit_y_valid = credit_valid[[label_col]]
credit_x_valid = credit_valid.drop(label_col, axis=1)

print('Size of training set: ', len(credit_x_train))
print('Size of validation set: ', len(credit_x_valid))

Create a scaler fitted on the training set and use it to scale both the training and validation data.

In [None]:
scaler = MinMaxScaler(feature_range=(0, 1), copy=True).fit(credit_x_train)
credit_x_train = pd.DataFrame(scaler.transform(credit_x_train),
                            columns = credit_x_train.columns, index = credit_x_train.index)
credit_x_valid = pd.DataFrame(scaler.transform(credit_x_valid),
                            columns = credit_x_valid.columns, index = credit_x_valid.index)

print('X train min =', round(credit_x_train.min().min(),4), '; max =', round(credit_x_train.max().max(), 4))
print('X valid min =', round(credit_x_valid.min().min(),4), '; max =', round(credit_x_valid.max().max(), 4))
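
Because the scaler is fitted on the training set only, validation values that lie outside the training range are mapped outside `[0, 1]`. This is why the printed validation minimum and maximum may differ slightly from 0 and 1. A minimal sketch with made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up single-feature data for illustration
x_train = np.array([[10.0], [20.0], [30.0]])
x_valid = np.array([[5.0], [25.0], [40.0]])

# Learn min (10) and max (30) from the training data only
toy_scaler = MinMaxScaler(feature_range=(0, 1)).fit(x_train)

x_train_scaled = toy_scaler.transform(x_train)
x_valid_scaled = toy_scaler.transform(x_valid)

print(x_train_scaled.ravel())  # training values span exactly 0 to 1
print(x_valid_scaled.ravel())  # 5 maps below 0, 40 maps above 1
```

Fitting on the training data alone avoids leaking information about the validation set into the preprocessing step.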

In [None]:
credit_x_valid.head(10)

<a id = "cell_deep"></a>
## 3. Deep Learning with Sequential Model

Load required libraries for Deep Learning with Sequential model.

In [None]:
import tensorflow as tf
from tensorflow.keras import metrics
from tensorflow.keras import regularizers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.optimizers import Nadam, RMSprop
from tensorflow.keras.losses import categorical_crossentropy

Convert pandas data frames to `np` arrays.

In [None]:
from tensorflow.keras.utils import to_categorical

arr_x_train = np.array(credit_x_train)
arr_y_train = np.array(credit_y_train)
arr_x_valid = np.array(credit_x_valid)
arr_y_valid = np.array(credit_y_valid)

# convert class vectors to binary class matrices
arr_y_train = to_categorical(arr_y_train, 2)
arr_y_valid = to_categorical(arr_y_valid, 2)

print('Train shape: x=', arr_x_train.shape, ', y=', arr_y_train.shape)
print('Valid shape: x=', arr_x_valid.shape, ', y=', arr_y_valid.shape)
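
For two classes, the one-hot encoding maps label `0` to `[1, 0]` and label `1` to `[0, 1]`, and `np.argmax` over the class axis recovers the original labels. The sketch below reproduces this with plain NumPy on made-up labels. It illustrates the encoding only and is not the `to_categorical` implementation.

```python
import numpy as np

# Made-up labels shaped like a single-column data frame: (n_samples, 1)
toy_labels = np.array([[0], [1], [1], [0]])

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(2)[toy_labels.ravel()]
print(one_hot)

# argmax over the class axis converts one-hot rows back to class indices
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```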

Create a **Keras model** for experimentation.


In [None]:
def basic_model_1():
    t_model = Sequential()
    t_model.add(Dense(100, activation="relu", input_shape=(23,)))
    t_model.add(Dense(2, activation='softmax'))
    # The summary is printed by the caller, so it is not repeated here
    return t_model
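
The instructions that follow refer to a second model, `basic_model_2`, which is not defined in this notebook. The sketch below is one hypothetical possibility, not the course's own definition: it adds a second hidden layer and dropout, and the layer sizes, the dropout rate, and the `n_features` parameter are all assumptions.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

def basic_model_2(n_features=23):
    # Hypothetical deeper variant of basic_model_1; sizes are illustrative only
    t_model = Sequential()
    t_model.add(Input(shape=(n_features,)))
    t_model.add(Dense(100, activation='relu'))
    t_model.add(Dropout(0.2))   # randomly drops 20% of units during training
    t_model.add(Dense(50, activation='relu'))
    t_model.add(Dense(2, activation='softmax'))
    return t_model

model_2 = basic_model_2()
print(model_2.output_shape)
```

A deeper model is not guaranteed to score better on this data, so compare validation results rather than training results when judging the two.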

Now we create the executable model using the function defined above. Run the code below through to the end to obtain the results. Then, once you have defined a second model function such as `basic_model_2`, change `basic_model_1` to `basic_model_2`, run the code again, and compare the results of the two models.

In [None]:
model = basic_model_1()
model.summary()

Fit the model and record the history of training and validation.
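Training runs for a fixed 100 epochs, because the cell below does not pass an `EarlyStopping` callback to `fit`. Adding one (for example with `patience=20`) would stop training once the validation loss stops improving.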

In [None]:
model.compile(optimizer=Nadam(learning_rate=0.005),
              loss=categorical_crossentropy,
              metrics=['accuracy'])

history = model.fit(arr_x_train, arr_y_train,
    batch_size=64,
    epochs=100,
    shuffle=True,
    verbose=2,
    validation_data=(arr_x_valid, arr_y_valid))
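
Early stopping can be wired in through a Keras callback passed to `fit`. The sketch below trains a tiny model on random data purely to show the wiring. The data, the model size, the `adam` optimizer, and `restore_best_weights=True` are assumptions, while `patience=20` follows the text.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# Random stand-in data: 200 samples, 23 features, one-hot labels for 2 classes
rng = np.random.default_rng(0)
toy_x = rng.random((200, 23)).astype('float32')
toy_y = np.eye(2, dtype='float32')[rng.integers(0, 2, size=200)]

toy_model = Sequential([Input(shape=(23,)),
                        Dense(8, activation='relu'),
                        Dense(2, activation='softmax')])
toy_model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])

# Stop when val_loss has not improved for 20 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=20,
                           restore_best_weights=True)

toy_history = toy_model.fit(toy_x, toy_y,
                            batch_size=64,
                            epochs=5,          # kept small so the sketch runs quickly
                            verbose=0,
                            validation_split=0.3,
                            callbacks=[early_stop])
print('Epochs run:', len(toy_history.history['val_loss']))
```

In the notebook itself, the same `callbacks=[early_stop]` argument could be added to the existing `model.fit` call, with `epochs` acting as an upper bound.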

Evaluate and report the performance of the trained model.

In [None]:
train_score = model.evaluate(arr_x_train, arr_y_train, verbose=0)
valid_score = model.evaluate(arr_x_valid, arr_y_valid, verbose=0)

print('Train Accuracy: ', round(train_score[1], 2), ', Train Loss: ', round(train_score[0], 2))
print('Val Accuracy: ', round(valid_score[1], 2), ', Val Loss: ', round(valid_score[0], 2))

In [None]:
from sklearn.metrics import classification_report
from sklearn.metrics import cohen_kappa_score


# Make predictions on the validation set (one row of class probabilities per sample)
y_pred = model.predict(arr_x_valid)

# Convert predicted probabilities and one-hot labels to class indices (0 or 1).
# The true labels get a new name so the one-hot arr_y_valid stays intact
# and earlier cells can be re-run.
y_pred_multiclass = np.argmax(y_pred, axis=1)
y_true_valid = np.argmax(arr_y_valid, axis=1)

# Calculate the kappa score
kappa = cohen_kappa_score(y_true_valid, y_pred_multiclass)
print("The result of Kappa is :", round(kappa, 3))

# Generate the classification report
report = classification_report(y_true_valid, y_pred_multiclass)

# Print the report
print("The result of the classification report is: \n ",report)

In [None]:
from sklearn.metrics import confusion_matrix
from sklearn.metrics import ConfusionMatrixDisplay


cm = confusion_matrix(
    y_true_valid,
    y_pred_multiclass)

# Create a ConfusionMatrixDisplay object
display = ConfusionMatrixDisplay(
    confusion_matrix=cm)

# Create a figure with a fixed size
fig = plt.figure(figsize=(5, 5))

# Create a subplot within the figure
ax = fig.subplots()

# Plot the confusion matrix as a heatmap
display.plot(ax=ax)

# Show the plot
plt.show()
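
The kappa score and the per-class precision and recall in the classification report can all be derived from the confusion matrix. The sketch below does this by hand for made-up label vectors and checks the kappa value against `cohen_kappa_score`. The labels are invented for illustration and are not results from the credit data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, cohen_kappa_score

# Made-up true and predicted labels (1 = default)
toy_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
toy_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 0, 0])

# Rows are true classes, columns are predicted classes
toy_cm = confusion_matrix(toy_true, toy_pred)
tn, fp, fn, tp = toy_cm.ravel()

precision = tp / (tp + fp)   # of predicted defaults, how many were real
recall = tp / (tp + fn)      # of real defaults, how many were found

# Cohen's kappa: observed agreement corrected for chance agreement
n = toy_cm.sum()
p_observed = (tp + tn) / n
p_expected = (toy_cm.sum(axis=0) * toy_cm.sum(axis=1)).sum() / n**2
kappa_manual = (p_observed - p_expected) / (1 - p_expected)

print('precision =', round(precision, 3), ', recall =', round(recall, 3))
print('kappa =', round(kappa_manual, 3),
      ', sklearn =', round(cohen_kappa_score(toy_true, toy_pred), 3))
```

On imbalanced data, accuracy alone can look high even when few defaults are found, so recall for class `1` and kappa are useful companions to it.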

**Exercise**: Try to improve the prediction performance of the model (e.g., by creating more complex models or by oversampling the minority class).
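
As a starting point for the oversampling idea, the sketch below duplicates minority-class rows at random until both classes are the same size, using `sklearn.utils.resample`. The synthetic `toy_train` frame is an assumption. In the notebook this would be applied to the training split only, so that the validation set keeps its original class balance.

```python
import numpy as np
import pandas as pd
from sklearn.utils import resample

# Synthetic training set: 90 non-default rows, 10 default rows
rng = np.random.default_rng(2020)
toy_train = pd.DataFrame({
    'feature': rng.normal(size=100),
    'dpnm': np.array([0] * 90 + [1] * 10),
})

majority = toy_train[toy_train['dpnm'] == 0]
minority = toy_train[toy_train['dpnm'] == 1]

# Sample minority rows with replacement until they match the majority count
minority_upsampled = resample(minority,
                              replace=True,
                              n_samples=len(majority),
                              random_state=2020)

# Recombine and shuffle so the classes are interleaved
toy_balanced = (pd.concat([majority, minority_upsampled])
                  .sample(frac=1, random_state=2020))

print(toy_balanced['dpnm'].value_counts())
```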