# Problem Statement:

This data set covers three groups of customers: those who paid off their loans, those who went past due and were put into collection without repaying the loan and interest, and those who paid off only after being put into collection. The financial product is a bullet loan: customers repay the entire debt in a single payment at the end of the term rather than on an installment schedule. They may also pay off earlier than scheduled.

### Attribute information:

- Loan_status: Whether a loan is paid off, in collection, held by a new customer yet to pay off, or paid off after collection efforts

- Principal: Principal loan amount at origination

- terms: Payoff schedule, which can be weekly (7 days), biweekly, or monthly

- Age, education, gender: A customer’s basic demographic information

### Importing the required libraries

In [9]:
# Libraries to help with reading and manipulating data
import pandas as pd
import numpy as np
# Libraries to help with data visualization
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
# Library to encode the variables
from sklearn.preprocessing import OneHotEncoder
# Library to split data
from sklearn.model_selection import train_test_split
# library to import different optimizers
from tensorflow.keras import optimizers
# Library to import different loss functions 
from tensorflow.keras import losses
from tensorflow.keras.layers import Dense
# Library to avoid the warnings 
import warnings
warnings.filterwarnings('ignore')
# importing keras library
from tensorflow import keras
# library to convert the target variables to numpy arrays
from tensorflow.keras.utils import to_categorical
# library to plot classification report
from sklearn.metrics import classification_report
# library to import Batch Normalization
from tensorflow.keras.layers import BatchNormalization
# Library to import Dropout
from tensorflow.keras.layers import Dropout

### Q Import the dataset and answer the question below
### What does the distribution of the ‘Age’ attribute look like?
- Right-skewed
- Left-skewed
- Normally Distributed

In [6]:
# importing the data
data = pd.read_csv('Loan_payments_data.csv')

FileNotFoundError: [Errno 2] No such file or directory: 'Loan_payments_data.csv'

In [None]:
sns.displot(data=data, x='age', kde=True)

In [10]:
data.age.skew() 

0.7219702338351359

### Correct Answer: Right-skewed

The skewness is about 0.72; a positive value means the longer tail is on the right.
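To make the sign convention concrete, here is a minimal sketch that computes the moment-based skewness coefficient with NumPy on made-up numbers. The `ages` list and the `moment_skewness` helper are hypothetical and are not part of the loan data or the notebook. Note that pandas' `Series.skew()` applies a bias correction, so its value differs slightly from this simpler formula, although the sign is interpreted the same way.

```python
import numpy as np

def moment_skewness(values):
    """Biased sample skewness: third central moment over variance ** 1.5."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    return m3 / m2 ** 1.5

# Hypothetical ages: most values are low, a few are much higher (right tail).
ages = [22, 23, 25, 26, 27, 28, 30, 31, 45, 51]
right = moment_skewness(ages)
# Mirroring the data moves the tail to the left and flips the sign.
left = moment_skewness([-a for a in ages])

print(f"skew = {right:.3f}, mirrored skew = {left:.3f}")
```

A histogram such as the `sns.displot` call above remains the quickest visual check; the coefficient simply puts a number on the asymmetry.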

### Q Build a Neural Network Model on the dataset and Obtain accuracy by following the below steps:

- Store the Independent and Dependent features in X and y
- Use train_test split to split the data (80% for training and 20% for testing)
- Convert the target feature into a NumPy array using Keras to_categorical function


Use the parameters listed below:

    - The number of neurons in the first and second layers is 64 and 32 respectively.
    - Use Dropout with a ratio of 0.2 after the second layer.
    - Use ReLU as the activation function in the hidden layers and Adam as the optimizer with a learning rate of 1e-3.
    - Train the model for 20 epochs.


- Note 

  - Do not use stratified sampling or callbacks.
  - The given dataset is already scaled, so do not scale it again.
  
  
- `>`30 and `<`50
- `>`51 and `<`70
- `>`70 and `<`85
- `>`90 

In [11]:
#Store the Independent and Dependent features in X and y
X = data.drop('loan_status',axis=1)
Y = data[['loan_status']]

In [12]:
# Use train_test split to split the data (80% for training and 20% for testing)
X_train, X_test, y_train, y_test=train_test_split(X, Y, test_size=0.2, random_state=1)

In [13]:
# Convert the target feature into a NumPy array using Keras to_categorical function
y_train = to_categorical(y_train, 3)
y_test_cat = to_categorical(y_test, 3)
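
For intuition, `to_categorical` turns integer class labels into one-hot rows. The sketch below reproduces that layout with plain NumPy on a few made-up labels (`labels` is hypothetical). It illustrates the encoding only and is not the Keras implementation.

```python
import numpy as np

# Hypothetical integer labels for a 3-class target (0, 1, 2).
labels = np.array([0, 2, 1, 0])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[labels]
print(one_hot)

# argmax along the class axis recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
```

In the notebook, `y_test` stays as integer labels (the one-hot copy is stored separately in `y_test_cat`), which is why it can be passed directly to `classification_report` later.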

In [14]:
# Defining the model
model = keras.Sequential()
# First hidden layer: 64 neurons, ReLU activation, 11 input features
model.add(Dense(64, activation='relu', input_shape=(11,)))
# Second hidden layer: 32 neurons, ReLU activation
model.add(Dense(32, activation='relu'))
# Dropout layer with a ratio of 0.2
model.add(Dropout(0.2))
# Output layer: 3 neurons with softmax activation
model.add(Dense(3, activation='softmax'))
# Adam optimizer with a learning rate of 1e-3
adam = optimizers.Adam(learning_rate=1e-3)
# Compile with categorical cross-entropy loss and accuracy as the metric
model.compile(loss=losses.categorical_crossentropy, optimizer=adam, metrics=['accuracy'])
# Fit on X_train and y_train for 20 epochs, holding out 20% for validation
history = model.fit(X_train, y_train, epochs=20, validation_split=0.2, verbose=2)

Epoch 1/20
10/10 - 0s - loss: 68.3540 - accuracy: 0.3719 - val_loss: 38.5270 - val_accuracy: 0.6125 - 449ms/epoch - 45ms/step
Epoch 2/20
10/10 - 0s - loss: 43.5958 - accuracy: 0.5875 - val_loss: 25.4263 - val_accuracy: 0.6125 - 32ms/epoch - 3ms/step
Epoch 3/20
10/10 - 0s - loss: 28.0714 - accuracy: 0.4563 - val_loss: 9.5784 - val_accuracy: 0.6125 - 32ms/epoch - 3ms/step
Epoch 4/20
10/10 - 0s - loss: 16.2916 - accuracy: 0.4875 - val_loss: 8.6752 - val_accuracy: 0.6125 - 32ms/epoch - 3ms/step
Epoch 5/20
10/10 - 0s - loss: 10.7863 - accuracy: 0.4219 - val_loss: 5.4323 - val_accuracy: 0.6000 - 32ms/epoch - 3ms/step
Epoch 6/20
10/10 - 0s - loss: 8.7959 - accuracy: 0.4719 - val_loss: 6.5111 - val_accuracy: 0.6125 - 32ms/epoch - 3ms/step
Epoch 7/20
10/10 - 0s - loss: 6.7043 - accuracy: 0.4313 - val_loss: 2.0239 - val_accuracy: 0.5875 - 32ms/epoch - 3ms/step
Epoch 8/20
10/10 - 0s - loss: 4.6589 - accuracy: 0.4563 - val_loss: 2.3471 - val_accuracy: 0.6000 - 32ms/epoch - 3ms/step
Epoch 9/20
10/1

### Correct Answer:  >51 and <70
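
The `10/10` prefix in the training log is the number of batches per epoch. The arithmetic below is a sketch of how that count arises, assuming 400 training rows (inferred from the classification reports later in the notebook, which show 100 test rows at a 20% test split) and the Keras default batch size of 32. `steps_per_epoch` is a helper name introduced here for illustration only.

```python
import math

def steps_per_epoch(n_train, validation_split, batch_size=32):
    """Batches per epoch once validation_split rows are held out of n_train."""
    n_val = int(n_train * validation_split)   # rows held out for validation
    n_fit = n_train - n_val                   # rows actually used for gradient updates
    return math.ceil(n_fit / batch_size)      # the last batch may be partial

# Assumed: 400 training rows, 20% validation split.
default_steps = steps_per_epoch(400, 0.2)           # default batch size of 32
large_batch_steps = steps_per_epoch(400, 0.2, 128)  # batch_size=128

print(default_steps, large_batch_steps)
```

Under these assumptions the counts are 10 and 3, which agrees with the `10/10` shown here and the `3/3` shown for the model that is fitted with `batch_size=128`.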

### Q Build a model on the data using the hyperparameters below and find the F1-score of class 0.

- The number of neurons in the first, second, third, and fourth layers should be 256, 124, 64, and 32 respectively.
- Use BatchNormalization after the second layer.
- Use ReLU as the activation function in the hidden layers and RMSprop as the optimizer with a learning rate of 1e-3.
- Train the model for 50 epochs.



- Note

    - Do not use stratified sampling or callbacks.
    - The given dataset is already scaled, so do not scale it again.
 
 
- 0.51 - 0.55
- 0.71 - 0.80
- 0.60 - 0.70
- 0.35 - 0.50 

In [15]:
# Defining the model
model_1 = keras.Sequential()
# First hidden layer: 256 neurons, ReLU activation, 11 input features
model_1.add(Dense(256, activation='relu', input_shape=(11,)))
# Second hidden layer: 124 neurons, ReLU activation
model_1.add(Dense(124, activation='relu'))
# Batch normalization after the second layer
model_1.add(BatchNormalization())
# Third hidden layer: 64 neurons, ReLU activation
model_1.add(Dense(64, activation='relu'))
# Fourth hidden layer: 32 neurons, ReLU activation
model_1.add(Dense(32, activation='relu'))
# Output layer: 3 neurons with softmax activation
model_1.add(Dense(3, activation='softmax'))
# RMSprop optimizer with a learning rate of 1e-3
rmsprop = optimizers.RMSprop(learning_rate=1e-3)
# Compile with categorical cross-entropy loss and accuracy as the metric
model_1.compile(loss=losses.categorical_crossentropy, optimizer=rmsprop, metrics=['accuracy'])
# Fit on X_train and y_train for 50 epochs, holding out 20% for validation
history_1 = model_1.fit(X_train, y_train, validation_split=0.2, epochs=50, verbose=2)

Epoch 1/50
10/10 - 1s - loss: 1.0853 - accuracy: 0.4969 - val_loss: 17.2261 - val_accuracy: 0.6125 - 649ms/epoch - 65ms/step
Epoch 2/50
10/10 - 0s - loss: 0.9499 - accuracy: 0.6250 - val_loss: 13.2569 - val_accuracy: 0.6125 - 53ms/epoch - 5ms/step
Epoch 3/50
10/10 - 0s - loss: 0.9408 - accuracy: 0.6250 - val_loss: 8.8035 - val_accuracy: 0.6125 - 43ms/epoch - 4ms/step
Epoch 4/50
10/10 - 0s - loss: 0.9483 - accuracy: 0.6250 - val_loss: 6.5334 - val_accuracy: 0.6125 - 40ms/epoch - 4ms/step
Epoch 5/50
10/10 - 0s - loss: 0.9416 - accuracy: 0.6250 - val_loss: 5.3939 - val_accuracy: 0.6125 - 40ms/epoch - 4ms/step
Epoch 6/50
10/10 - 0s - loss: 0.9281 - accuracy: 0.6250 - val_loss: 4.7476 - val_accuracy: 0.6125 - 41ms/epoch - 4ms/step
Epoch 7/50
10/10 - 0s - loss: 0.9396 - accuracy: 0.6250 - val_loss: 3.6360 - val_accuracy: 0.6125 - 33ms/epoch - 3ms/step
Epoch 8/50
10/10 - 0s - loss: 0.9450 - accuracy: 0.6250 - val_loss: 3.1617 - val_accuracy: 0.6125 - 40ms/epoch - 4ms/step
Epoch 9/50
10/10 - 0

In [16]:
# Predicting on Test data 
y_pred=model_1.predict(X_test)

In [17]:
# Applying the argmax function
y_pred_final=[]
for i in y_pred:
    y_pred_final.append(np.argmax(i))
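
The loop above takes the most probable class for each row. As a sketch, the same result can be obtained with a single vectorized `np.argmax` call over the class axis. The probabilities below are made up for illustration and are not model output.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes.
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],   # tie: argmax returns the first maximum
])

# Loop version, mirroring the notebook cell.
loop_labels = []
for row in y_prob:
    loop_labels.append(np.argmax(row))

# Vectorized version: one call over axis 1 (the class axis).
vector_labels = np.argmax(y_prob, axis=1)
print(vector_labels)
```

Both forms yield the same labels; the vectorized call returns a NumPy array rather than a Python list, which `classification_report` accepts equally well.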

In [18]:
# Classification report 
print(classification_report(y_test,y_pred_final))

              precision    recall  f1-score   support

           0       0.51      1.00      0.68        51
           1       0.00      0.00      0.00        23
           2       0.00      0.00      0.00        26

    accuracy                           0.51       100
   macro avg       0.17      0.33      0.23       100
weighted avg       0.26      0.51      0.34       100



### Correct Answer:  0.60 - 0.70
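
The F1-score in the report is the harmonic mean of precision and recall. The short sketch below redoes that arithmetic for class 0 using the rounded values from the report (precision 0.51, recall 1.00); `f1_score_from` is a helper name introduced here for illustration.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Class 0 in the report above: precision 0.51, recall 1.00.
f1_class0 = f1_score_from(0.51, 1.00)
print(round(f1_class0, 2))
```

The result, 1.02 / 1.51, rounds to 0.68, which falls in the 0.60 - 0.70 range.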

### Q Build a model on the data using the hyperparameters below and find the precision of class 0.

- The number of neurons in the first, second, third, and fourth layers should be 128, 64, 64, and 32 respectively.
- Use Dropout with a ratio of 0.3 after the second layer and BatchNormalization after the third layer.
- Use ReLU as the activation function in the hidden layers and Adam as the optimizer with a learning rate of 1e-3.
- Train the model for 100 epochs.


- Note 

   - Do not use stratified sampling or callbacks.
   - The given dataset is already scaled, so do not scale it again.

- 0.20 - 0.30 
- 0.31 - 0.60 
- 0.61 - 0.75
- `>`0.80

In [19]:
# Defining the model
model_2 = keras.Sequential()
# First hidden layer: 128 neurons, ReLU activation, He-uniform init, 11 input features
model_2.add(Dense(128, activation='relu', kernel_initializer='he_uniform', input_shape=(11,)))
# Second hidden layer: 64 neurons, ReLU activation
model_2.add(Dense(64, activation='relu', kernel_initializer='he_uniform'))
# Dropout layer with a ratio of 0.3 after the second layer
model_2.add(Dropout(0.3))
# Third hidden layer: 64 neurons, ReLU activation
model_2.add(Dense(64, activation='relu', kernel_initializer='he_uniform'))
# Batch normalization after the third layer
model_2.add(BatchNormalization())
# Fourth hidden layer: 32 neurons, ReLU activation
model_2.add(Dense(32, activation='relu', kernel_initializer='he_uniform'))
# Output layer: 3 neurons with softmax activation
model_2.add(Dense(3, activation='softmax'))
# Adam optimizer with a learning rate of 1e-3
adam = optimizers.Adam(learning_rate=1e-3)
# Compile with categorical cross-entropy loss and accuracy as the metric
model_2.compile(loss=losses.categorical_crossentropy, optimizer=adam, metrics=['accuracy'])
# Fit on X_train and y_train for 100 epochs, holding out 20% for validation
history_2 = model_2.fit(X_train, y_train, validation_split=0.2, epochs=100, batch_size=128, verbose=2)

Epoch 1/100
3/3 - 1s - loss: 1.3490 - accuracy: 0.4062 - val_loss: 3.3828 - val_accuracy: 0.6125 - 501ms/epoch - 167ms/step
Epoch 2/100
3/3 - 0s - loss: 1.3699 - accuracy: 0.4125 - val_loss: 2.2976 - val_accuracy: 0.6125 - 33ms/epoch - 11ms/step
Epoch 3/100
3/3 - 0s - loss: 1.1489 - accuracy: 0.5375 - val_loss: 1.8980 - val_accuracy: 0.4125 - 32ms/epoch - 11ms/step
Epoch 4/100
3/3 - 0s - loss: 1.1693 - accuracy: 0.5031 - val_loss: 1.7188 - val_accuracy: 0.2500 - 28ms/epoch - 9ms/step
Epoch 5/100
3/3 - 0s - loss: 1.0971 - accuracy: 0.5375 - val_loss: 1.6260 - val_accuracy: 0.2500 - 24ms/epoch - 8ms/step
Epoch 6/100
3/3 - 0s - loss: 1.1431 - accuracy: 0.5437 - val_loss: 1.5548 - val_accuracy: 0.2500 - 25ms/epoch - 8ms/step
Epoch 7/100
3/3 - 0s - loss: 1.0899 - accuracy: 0.5656 - val_loss: 1.5874 - val_accuracy: 0.2500 - 32ms/epoch - 11ms/step
Epoch 8/100
3/3 - 0s - loss: 1.0880 - accuracy: 0.5719 - val_loss: 1.6768 - val_accuracy: 0.2500 - 32ms/epoch - 11ms/step
Epoch 9/100
3/3 - 0s - lo

Epoch 69/100
3/3 - 0s - loss: 0.9494 - accuracy: 0.6187 - val_loss: 0.9557 - val_accuracy: 0.6125 - 31ms/epoch - 10ms/step
Epoch 70/100
3/3 - 0s - loss: 0.9257 - accuracy: 0.6187 - val_loss: 0.9541 - val_accuracy: 0.6125 - 24ms/epoch - 8ms/step
Epoch 71/100
3/3 - 0s - loss: 0.9541 - accuracy: 0.6313 - val_loss: 0.9499 - val_accuracy: 0.6125 - 16ms/epoch - 5ms/step
Epoch 72/100
3/3 - 0s - loss: 0.9510 - accuracy: 0.6187 - val_loss: 0.9463 - val_accuracy: 0.6125 - 22ms/epoch - 7ms/step
Epoch 73/100
3/3 - 0s - loss: 0.9256 - accuracy: 0.6219 - val_loss: 0.9425 - val_accuracy: 0.6125 - 29ms/epoch - 10ms/step
Epoch 74/100
3/3 - 0s - loss: 0.9513 - accuracy: 0.6187 - val_loss: 0.9420 - val_accuracy: 0.6125 - 25ms/epoch - 8ms/step
Epoch 75/100
3/3 - 0s - loss: 0.9358 - accuracy: 0.6219 - val_loss: 0.9412 - val_accuracy: 0.6125 - 18ms/epoch - 6ms/step
Epoch 76/100
3/3 - 0s - loss: 0.9316 - accuracy: 0.6281 - val_loss: 0.9410 - val_accuracy: 0.6125 - 24ms/epoch - 8ms/step
Epoch 77/100
3/3 - 0s 

In [20]:
# predicting on test data
y_pred_2=model_2.predict(X_test)

In [21]:
# Applying argmax function
y_pred_final_2=[]
for i in y_pred_2:
    y_pred_final_2.append(np.argmax(i))

In [22]:
# Classification report
print(classification_report(y_test,y_pred_final_2))

              precision    recall  f1-score   support

           0       0.51      1.00      0.68        51
           1       0.00      0.00      0.00        23
           2       0.00      0.00      0.00        26

    accuracy                           0.51       100
   macro avg       0.17      0.33      0.23       100
weighted avg       0.26      0.51      0.34       100



### Correct Answer: 0.31 - 0.60 
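
The report shows a recall of 1.00 for class 0 and 0.00 for the other classes, which is what happens when every test row is predicted as class 0. In that situation the precision of class 0 equals its share of the test set. The sketch below checks this by rebuilding label vectors from the support counts in the report (51, 23, and 26); `per_class_precision` is a hypothetical helper, and assigning 0.0 to classes that are never predicted is a convention chosen here to match what the report displays.

```python
import numpy as np

def per_class_precision(y_true, y_pred, num_classes):
    """Precision per class; classes that are never predicted get 0.0."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    result = []
    for c in range(num_classes):
        predicted_c = y_pred == c
        n_predicted = predicted_c.sum()
        if n_predicted == 0:
            result.append(0.0)  # ratio is undefined; reported as 0.0 here
        else:
            correct = (y_true[predicted_c] == c).sum()
            result.append(float(correct / n_predicted))
    return result

# Rebuild true labels from the supports in the report: 51, 23 and 26 rows.
y_true = np.repeat([0, 1, 2], [51, 23, 26])
# A model that always predicts class 0.
y_pred = np.zeros_like(y_true)

precisions = per_class_precision(y_true, y_pred, 3)
print(precisions)
```

Class 0 gets 51 / 100 = 0.51, which falls in the 0.31 - 0.60 range.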