In this assignment, you are tasked with building a multilayer perceptron (MLP) model. The requirements are:

    * Choose any dataset
    * Explain the problem you are trying to solve
    * Create your own model
    * Evaluate the accuracy of your model



**Choosing datasets**

link: https://www.kaggle.com/datasets/bhavikjikadara/loan-status-prediction

In [None]:
# Mount Google Drive in Google Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Read the CSV file from the mounted drive
df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/loan_data.csv')

In [None]:
df.head(10)

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001003,Male,Yes,1,Graduate,No,4583,1508.0,128.0,360.0,1.0,Rural,N
1,LP001005,Male,Yes,0,Graduate,Yes,3000,0.0,66.0,360.0,1.0,Urban,Y
2,LP001006,Male,Yes,0,Not Graduate,No,2583,2358.0,120.0,360.0,1.0,Urban,Y
3,LP001008,Male,No,0,Graduate,No,6000,0.0,141.0,360.0,1.0,Urban,Y
4,LP001013,Male,Yes,0,Not Graduate,No,2333,1516.0,95.0,360.0,1.0,Urban,Y
5,LP001024,Male,Yes,2,Graduate,No,3200,700.0,70.0,360.0,1.0,Urban,Y
6,LP001027,Male,Yes,2,Graduate,,2500,1840.0,109.0,360.0,1.0,Urban,Y
7,LP001029,Male,No,0,Graduate,No,1853,2840.0,114.0,360.0,1.0,Rural,N
8,LP001030,Male,Yes,2,Graduate,No,1299,1086.0,17.0,120.0,1.0,Urban,Y
9,LP001032,Male,No,0,Graduate,No,4950,0.0,125.0,360.0,1.0,Urban,Y


In [None]:
# Encode the two-valued categorical columns as 0/1.
# Caution: each lambda sends every value other than the named one to 1,
# so missing values (e.g. Self_Employed in row 6) are also encoded as 1.
df["Loan_Status"] = df["Loan_Status"].apply(lambda value: 0 if value == 'N' else 1)
df["Married"] = df["Married"].apply(lambda value: 0 if value == 'No' else 1)
df["Education"] = df["Education"].apply(lambda value: 0 if value == 'Not Graduate' else 1)
df["Self_Employed"] = df["Self_Employed"].apply(lambda value: 0 if value == 'No' else 1)
# Property_Area: 0 for Rural, 1 for any other area
df["Property_Area"] = df["Property_Area"].apply(lambda value: 0 if value == 'Rural' else 1)
# Dependents has more than two levels, so use integer category codes
df['Dependents'] = df['Dependents'].astype('category').cat.codes

df.head(10)

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001003,Male,1,1,1,0,4583,1508.0,128.0,360.0,1.0,0,0
1,LP001005,Male,1,0,1,1,3000,0.0,66.0,360.0,1.0,1,1
2,LP001006,Male,1,0,0,0,2583,2358.0,120.0,360.0,1.0,1,1
3,LP001008,Male,0,0,1,0,6000,0.0,141.0,360.0,1.0,1,1
4,LP001013,Male,1,0,0,0,2333,1516.0,95.0,360.0,1.0,1,1
5,LP001024,Male,1,2,1,0,3200,700.0,70.0,360.0,1.0,1,1
6,LP001027,Male,1,2,1,1,2500,1840.0,109.0,360.0,1.0,1,1
7,LP001029,Male,0,0,1,0,1853,2840.0,114.0,360.0,1.0,0,0
8,LP001030,Male,1,2,1,0,1299,1086.0,17.0,120.0,1.0,1,1
9,LP001032,Male,0,0,1,0,4950,0.0,125.0,360.0,1.0,1,1
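
Note that the lambda-based encoding above sends every value other than the named one to 1, so the missing `Self_Employed` entry in row 6 becomes 1. A minimal sketch of an alternative, assuming explicit dictionaries passed to `Series.map`, which leaves missing or unlisted values as `NaN` so they can be handled deliberately. The small frame below is made up for illustration and is not taken from `loan_data.csv`:

```python
import numpy as np
import pandas as pd

# Toy frame for illustration only (not rows from loan_data.csv).
toy = pd.DataFrame({
    "Married": ["Yes", "No", "Yes"],
    "Self_Employed": ["No", np.nan, "Yes"],
})

# Explicit dictionaries: values that are not listed (including NaN)
# map to NaN instead of silently becoming 1.
binary_maps = {
    "Married": {"No": 0, "Yes": 1},
    "Self_Employed": {"No": 0, "Yes": 1},
}

encoded = toy.copy()
for column, mapping in binary_maps.items():
    encoded[column] = toy[column].map(mapping)

# Count what is still missing so it can be imputed or dropped on purpose.
missing_per_column = encoded.isna().sum()
print(encoded)
print(missing_per_column)
```

Whether the remaining missing values should be imputed or the rows dropped depends on the dataset; the sketch only makes them visible.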


**Explain the problem you are trying to solve**

With this dataset, I use a multilayer perceptron (MLP) to study how banks decide on loan applications. By training on historical loan data, my goal is to build a model that predicts the loan decision (`Loan_Status`) from applicant attributes such as income, loan amount, and credit history.
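
One way this task could be framed is as binary classification: the encoded `Loan_Status` column is the target and the remaining numeric columns are the inputs. The sketch below is an assumption about how that might look, not the model trained in this notebook. The helper names `prepare_loan_features` and `build_loan_mlp`, the layer sizes, and the toy rows are all hypothetical.

```python
import numpy as np
import pandas as pd


def prepare_loan_features(df, target="Loan_Status"):
    """Return a float32 feature matrix X and target vector y (hypothetical helper)."""
    # Keep numeric columns only (this drops Loan_ID and the unencoded Gender
    # column), discard the leftover index column if present, and drop rows
    # that still contain missing values.
    numeric = (
        df.select_dtypes(include="number")
        .drop(columns=["Unnamed: 0"], errors="ignore")
        .dropna()
    )
    y = numeric[target].to_numpy(dtype="float32")
    X = numeric.drop(columns=[target]).to_numpy(dtype="float32")
    return X, y


def build_loan_mlp(n_features):
    """Small binary-classification MLP (layer sizes are arbitrary assumptions)."""
    # Imported lazily so the data-preparation helper works without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),  # P(Loan_Status == 1)
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Loan_ID": ["A1", "A2", "A3"],
    "ApplicantIncome": [5000, 3000, 2500],
    "LoanAmount": [120.0, 60.0, np.nan],
    "Credit_History": [1.0, 1.0, 1.0],
    "Loan_Status": [0, 1, 1],
})
X, y = prepare_loan_features(toy)
print(X.shape, y)
```

With real data, the features would still need scaling (for example a `StandardScaler` fit on the training split only) before calling `build_loan_mlp(X.shape[1])` and fitting the model.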

**Create your own model**

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, accuracy_score
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.datasets import mnist
from sklearn.model_selection import train_test_split

# Load the MNIST dataset (note: this cell uses MNIST, not the loan data prepared above)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten the images
x_train = x_train.reshape((x_train.shape[0], 28 * 28))
x_test = x_test.reshape((x_test.shape[0], 28 * 28))

# Normalize the pixel values
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# One-hot encode the target variable
y_train = np.eye(10)[y_train]
y_test = np.eye(10)[y_test]

# Re-split the MNIST training set 80/20; this replaces the official test split loaded above
x_train, x_test, y_train, y_test = train_test_split(x_train, y_train, test_size=0.2, random_state=42)

# Scale the features
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# Define the MLP model
model = Sequential()
model.add(Flatten(input_shape=(28 * 28,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

# Make predictions on the testing set
predictions = model.predict(x_test)

# Convert the predictions to class labels
predictions = np.argmax(predictions, axis=1)

# Evaluate the model
print('Accuracy Score:')
print(accuracy_score(np.argmax(y_test, axis=1), predictions))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Accuracy Score:
0.96875
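
A single accuracy figure does not show which classes are mistaken for which. As a minimal sketch, a confusion matrix can be built from integer labels with NumPy alone (scikit-learn's `confusion_matrix` would serve the same purpose). The helper name `confusion_counts` is hypothetical, and the label arrays below are made up for illustration; they are not this model's predictions.

```python
import numpy as np


def confusion_counts(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix


# Made-up labels for three classes.
true_labels = np.array([0, 0, 1, 1, 2, 2, 2])
predicted_labels = np.array([0, 1, 1, 1, 2, 0, 2])

matrix = confusion_counts(true_labels, predicted_labels, n_classes=3)
accuracy = np.trace(matrix) / matrix.sum()               # correct / total
per_class_recall = np.diag(matrix) / matrix.sum(axis=1)  # correct / true count
print(matrix)
print(accuracy, per_class_recall)
```

In the notebook, `np.argmax(y_test, axis=1)` and `predictions` could be passed in the same way with `n_classes=10`.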


**Evaluate the accuracy of your model**

The training results show that the MLP model learns the features of the MNIST dataset well and reaches a high accuracy. Over the 10 epochs, training accuracy rises and training loss falls steadily, which demonstrates that the model is learning from the training set. Validation accuracy also rises consistently, although it lags slightly behind training accuracy. This gap may hint at overfitting, but because it is small it points to a minor rather than a major problem. On the held-out test split, the model reaches an accuracy score of 0.96875.
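
The comparison between training and validation accuracy can be made explicit. Keras's `model.fit` returns a `History` object whose `history` attribute is a dictionary of per-epoch metrics; with `metrics=['accuracy']` and a validation split it contains the keys `accuracy`, `val_accuracy`, `loss`, and `val_loss`. The sketch below works on such a dictionary. The helper name `accuracy_gaps` and the numbers are hypothetical and are not the values logged in this notebook.

```python
def accuracy_gaps(history):
    """Per-epoch difference between training and validation accuracy."""
    return [
        round(train - val, 4)
        for train, val in zip(history["accuracy"], history["val_accuracy"])
    ]


# Hypothetical metrics for three epochs.
example_history = {
    "accuracy": [0.90, 0.95, 0.98],
    "val_accuracy": [0.91, 0.94, 0.95],
}

gaps = accuracy_gaps(example_history)
# True when the gap never shrinks from one epoch to the next.
widening = all(later >= earlier for earlier, later in zip(gaps, gaps[1:]))
print(gaps, widening)
```

A gap that keeps widening while validation accuracy stalls is the usual sign of overfitting. In the notebook, the dictionary would come from `model.fit(...).history`.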