![Credit card being held in hand](credit_card.jpg)

Commercial banks receive _a lot_ of applications for credit cards. Many of them get rejected for many reasons, like high loan balances, low income levels, or too many inquiries on an individual's credit report, for example. Manually analyzing these applications is mundane, error-prone, and time-consuming (and time is money!). Luckily, this task can be automated with the power of machine learning and pretty much every commercial bank does so nowadays. In this workbook, you will build an automatic credit card approval predictor using machine learning techniques, just like real banks do.

### The Data

The data is a small subset of the Credit Card Approval dataset from the UCI Machine Learning Repository showing the credit card applications a bank receives. This dataset has been loaded as a `pandas` DataFrame called `cc_apps`. The last column in the dataset is the target value.

In [8]:
# Import necessary libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import LabelEncoder
import torch.nn as nn
import torch
from torch.utils.data import DataLoader, TensorDataset

# Load the dataset
cc_apps = pd.read_csv("cc_approvals.data", header=None)  # Load the credit card approvals dataset
cc_apps.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13
0,b,30.83,0.0,u,g,w,v,1.25,t,t,1,g,0,+
1,a,58.67,4.46,u,g,q,h,3.04,t,t,6,g,560,+
2,a,24.5,0.5,u,g,q,h,1.5,t,f,0,g,824,+
3,b,27.83,1.54,u,g,w,v,3.75,t,t,5,g,3,+
4,b,20.17,5.625,u,g,w,v,1.71,t,f,0,s,0,+


In [9]:
# Columns with categorical (non-numeric) data
letter_idx = [0, 3, 4, 5, 6, 8, 9, 11, 13]
# Columns with numeric data
num_idx = [1, 2, 7, 10, 12]

# Replace '?' in categorical columns with the most frequent value (mode)
for idx in letter_idx:
    max_count_value = cc_apps[idx].value_counts().idxmax()
    cc_apps[idx] = cc_apps[idx].replace('?', max_count_value)

# Replace '?' in numeric columns with the mean of the column
for idx in num_idx:
    cc_apps[idx] = pd.to_numeric(cc_apps[idx], errors='coerce')  # Convert to numeric
    average_value = cc_apps[idx].mean()
    cc_apps[idx] = cc_apps[idx].fillna(average_value)  # Replace NaN with mean

# Encode all columns to numeric values using LabelEncoder
for idx in range(14):
    cc_apps[idx] = LabelEncoder().fit_transform(cc_apps[idx])

In [10]:
# Split the dataset into features (X) and target (Y)
X = cc_apps.drop(13, axis=1)  # Features
Y = cc_apps[[13]]  # Target variable

# Split into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=42)

# Standardize the training features
scaler_train = StandardScaler()
X_train_scale = scaler_train.fit_transform(X_train)  # Fit and transform training data
X_train_scale = torch.tensor(X_train_scale, dtype=torch.float)  # Convert to PyTorch tensor
Y_train = torch.tensor(Y_train.values, dtype=torch.float)  # Convert target to PyTorch tensor

# Standardize the testing features
scaler_test = StandardScaler()
X_test_scale = scaler_test.fit_transform(X_test)  # Fit and transform testing data
X_test_scale = torch.tensor(X_test_scale, dtype=torch.float)  # Convert to PyTorch tensor
Y_test = torch.tensor(Y_test.values, dtype=torch.float)  # Convert target to PyTorch tensor

In [11]:
# Define the neural network architecture
class MultiLayerNN(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)  # First fully connected layer
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)  # Second fully connected layer
        self.output = nn.Linear(hidden_dim, out_dim)  # Output layer
        self.relu = nn.ReLU()  # ReLU activation function
        self.sigmoid = nn.Sigmoid()  # Sigmoid activation for binary classification
    
    def forward(self, x):
        # Forward pass through the layers
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))  # Adding multiple passes through the hidden layer
        x = self.relu(self.fc2(x))
        x = self.relu(self.fc2(x))
        x = self.relu(self.fc2(x))
        x = self.sigmoid(self.output(x))  # Final sigmoid activation for output
        return x

# Initialize hyperparameters and the model
epochs = 600  # Number of training epochs
learning_rate = 0.15  # Learning rate for optimizer
model = MultiLayerNN(X_train_scale.shape[1], 256, 1)  # Create the model
criterion = nn.BCELoss()  # Binary Cross-Entropy Loss for binary classification
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)  # SGD optimizer
model.train()  # Set model to training mode

# Create DataLoader for training and testing datasets
dataset_train = TensorDataset(X_train_scale, Y_train)  # Training dataset
dataloader_train = DataLoader(dataset_train, batch_size=10, shuffle=True)  # DataLoader for training

dataset_test = TensorDataset(X_test_scale, Y_test)  # Testing dataset
dataloader_test = DataLoader(dataset_test, batch_size=10, shuffle=True)  # DataLoader for testing

In [12]:
# Training loop
for epoch in range(epochs):
    training_losses = []  # To track training loss for each epoch
    for batch in dataloader_train:
        x, y = batch[0], batch[1]  # Get a batch of data
        optimizer.zero_grad()  # Zero the gradient buffers
        pred = model(x)  # Forward pass
        loss = criterion(pred, y)  # Compute loss
        loss.backward()  # Backpropagation
        optimizer.step()  # Update model parameters
        training_losses.append(loss.item())  # Store the loss
    if (epoch + 1) % 50 == 0:
        print(f"Training loss of Epoch {epoch + 1}: {np.mean(training_losses)}")  # Print loss every 50 epochs

Training loss of Epoch 50: 0.019190441194485937
Training loss of Epoch 100: 0.005411607967778208
Training loss of Epoch 150: 0.005028096604128507
Training loss of Epoch 200: 0.004890313664781534
Training loss of Epoch 250: 0.010260682418304922
Training loss of Epoch 300: 0.00435260547092498
Training loss of Epoch 350: 0.00448014042639456
Training loss of Epoch 400: 0.003321779052191922
Training loss of Epoch 450: 0.002221178280446594
Training loss of Epoch 500: 0.002076605235285636
Training loss of Epoch 550: 0.004845103858863731
Training loss of Epoch 600: 0.005474136827418812


In [None]:
# Evaluate the model on the training data
with torch.no_grad():  # Disable gradient computation for evaluation
    model.eval()  # Set model to evaluation mode
    num_correct = 0
    total_size = 0
    for batch in dataloader_train:
        x, y = batch[0], batch[1]
        pred = model(x)  # Forward pass
        pred_prob = torch.sigmoid(pred)  # Apply sigmoid to get probabilities
        pred_labels = (pred_prob > 0.5).float()  # Convert probabilities to binary predictions
        num_correct += (pred_labels == y).sum().item()  # Count correct predictions
        total_size += len(x)  # Count total samples
    best_score = num_correct / total_size  # Calculate accuracy
    print(f"Accuracy for training data: {best_score}")

Accuracy for training data: 0.9057971014492754


In [None]:
# Evaluate the model on the testing data
with torch.no_grad():  # Disable gradient computation for evaluation
    model.eval()  # Set model to evaluation mode
    num_correct = 0
    total_size = 0
    for batch in dataloader_test:
        x, y = batch[0], batch[1]
        pred = model(x)  # Forward pass
        pred_prob = torch.sigmoid(pred)  # Apply sigmoid to get probabilities
        pred_labels = (pred_prob > 0.5).float()  # Convert probabilities to binary predictions
        num_correct += (pred_labels == y).sum().item()  # Count correct predictions
        total_size += len(x)  # Count total samples
    print(f"Accuracy for testing data: {num_correct / total_size}")  # Print testing accuracy

Accuracy for testing data: 0.7391304347826086
