## Table of Contents: Support Vector Machine – Loan Approval Classification

Last Edited: September 30th, 2024

1. Loading the Dataset (`LoanDataset`)
2. Handling Missing Values (`dropna`) and Preview (`head()`)
3. Outlier Removal on `customer_age` (IQR Rule)
4. Feature and Target Definition (`X`, `y`)
5. Encoding the Target and Categorical Features (`LabelEncoder`, `get_dummies`)
6. Train–Test Split
7. Feature Scaling (`StandardScaler`)
8. Support Vector Machine Model Training (`SVC`)
9. Model Prediction on Test Set
10. Model Evaluation: Accuracy Score

In [None]:
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Mount Google Drive so the CSV can be read from within Colab
from google.colab import drive
drive.mount("/content/drive")

# Load the loan dataset into a DataFrame
path = "/content/drive/MyDrive/Kellton Tech/Model Code/dataset/LoanDataset - LoansDatasest.csv"
df = pd.read_csv(path)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
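# Show the number of missing values per column
# (print is needed because only a cell's last expression is displayed)
print(df.isnull().sum())

# Drop every row that contains at least one missing value, then preview the result
df_data = df.dropna()
df_data.head()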

Unnamed: 0,customer_id,customer_age,customer_income,home_ownership,employment_duration,loan_intent,loan_grade,loan_amnt,loan_int_rate,term_years,historical_default,cred_hist_length,Current_loan_status
0,1.0,22,59000,RENT,123.0,PERSONAL,C,"£35,000.00",16.02,10,Y,3,DEFAULT
2,3.0,25,9600,MORTGAGE,1.0,MEDICAL,B,"£5,500.00",12.87,5,N,3,DEFAULT
3,4.0,23,65500,RENT,4.0,MEDICAL,B,"£35,000.00",15.23,10,N,2,DEFAULT
4,5.0,24,54400,RENT,8.0,MEDICAL,B,"£35,000.00",14.27,10,Y,4,DEFAULT
5,6.0,21,9900,OWN,2.0,VENTURE,A,"£2,500.00",7.14,1,N,2,DEFAULT
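
The table of contents mentions both `dropna` and `fillna`, while the cell above only drops rows. As a minimal sketch of the alternative, the block below imputes instead of dropping. It runs on a small made-up frame (`toy`, not the real dataset), and the choice of median for numeric columns and mode for categorical ones is an assumption, not something the notebook prescribes.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the loan data (values are illustrative only)
toy = pd.DataFrame({
    "customer_age": [22, 25, np.nan, 24],
    "home_ownership": ["RENT", None, "RENT", "OWN"],
})

# Count missing values per column before deciding how to handle them
print(toy.isnull().sum())

# Option A: drop incomplete rows (what the notebook does)
dropped = toy.dropna()

# Option B: impute instead, using the median for the numeric column
# and the most frequent value (mode) for the categorical column
filled = toy.copy()
filled["customer_age"] = filled["customer_age"].fillna(filled["customer_age"].median())
filled["home_ownership"] = filled["home_ownership"].fillna(filled["home_ownership"].mode()[0])

print(len(dropped), "rows kept by dropna;", len(filled), "rows kept by fillna")
```

Dropping is simpler but discards rows; imputing keeps every row at the cost of introducing estimated values.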


In [None]:
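# Quartiles and interquartile range (IQR) of customer_age
Q1 = df_data['customer_age'].quantile(0.25)
Q3 = df_data['customer_age'].quantile(0.75)
IQR = Q3 - Q1

# Tukey fences: values more than 1.5 * IQR outside the quartiles count as outliers
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

# Boolean mask that is True for rows whose age lies within the fences (inclusive)
age_mask = df_data['customer_age'].between(lower_bound, upper_bound)

In [None]:
# Keep only the rows that pass the age filter
df_data2 = df_data[age_mask]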
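
To make the IQR rule used above concrete, here is a small worked example on a made-up age series (the values are illustrative, not taken from the dataset). It computes the same fences and shows which value is filtered out.

```python
import pandas as pd

# Illustrative ages; 80 is far from the rest and should be flagged
ages = pd.Series([21, 22, 23, 24, 25, 26, 27, 80], name="customer_age")

q1 = ages.quantile(0.25)   # 22.75
q3 = ages.quantile(0.75)   # 26.25
iqr = q3 - q1              # 3.5

lower = q1 - 1.5 * iqr     # 17.5
upper = q3 + 1.5 * iqr     # 31.5

# Keep only values inside the fences (between is inclusive on both ends)
kept = ages[ages.between(lower, upper)]
print(f"fences: [{lower}, {upper}]")
print("kept:", kept.tolist())
```

Only `customer_age` is filtered in the notebook; other numeric columns keep their extreme values.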

In [None]:
target = 'Current_loan_status'
X = df_data2.drop(columns=[target])
y = df_data2[target]
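
One thing worth checking at this step: the preview shows `loan_amnt` stored as text (for example `£35,000.00`) and `customer_id` as a row identifier. Left as they are, a later one-hot encoding would treat every distinct amount string as its own category, and the identifier would be fed to the model as a feature. The sketch below shows one possible cleanup on a toy frame built from values in the preview. It is an assumption about how one might prepare these columns, not what the notebook does, and applying it would change the accuracy reported at the end.

```python
import pandas as pd

# Toy rows mirroring the preview (illustrative subset of columns)
toy = pd.DataFrame({
    "customer_id": [1.0, 3.0, 6.0],
    "loan_amnt": ["£35,000.00", "£5,500.00", "£2,500.00"],
    "loan_int_rate": [16.02, 12.87, 7.14],
})

# Drop the identifier, which carries no predictive signal
features = toy.drop(columns=["customer_id"])

# Strip the currency symbol and thousands separators, then convert to float
features["loan_amnt"] = (
    features["loan_amnt"].str.replace(r"[£,]", "", regex=True).astype(float)
)

print(features.dtypes)
print(features)
```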

In [None]:
encoder = LabelEncoder()
y = encoder.fit_transform(y)
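
`LabelEncoder` assigns integers in sorted order of the class names, so it helps to inspect the mapping before interpreting predictions. A minimal sketch follows; the label `"NO DEFAULT"` is an assumed second class for illustration, since the preview only shows `DEFAULT`.

```python
from sklearn.preprocessing import LabelEncoder

# "NO DEFAULT" is a hypothetical second class; only "DEFAULT" appears in the preview
labels = ["DEFAULT", "NO DEFAULT", "DEFAULT", "NO DEFAULT"]

demo_encoder = LabelEncoder()
encoded = demo_encoder.fit_transform(labels)

# classes_ lists the original labels; the position of each label is its integer code
mapping = {label: code for code, label in enumerate(demo_encoder.classes_)}
print(mapping)

# inverse_transform turns integer predictions back into readable labels
decoded = demo_encoder.inverse_transform([0, 1])
print(list(decoded))
```

In the notebook, `encoder.classes_` gives the same information for the real target column.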

In [None]:
X = pd.get_dummies(X, drop_first=True)
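
As an illustration of what `drop_first=True` does, the toy example below one-hot encodes a single categorical column. The frame is made up from values that appear in the preview; numeric columns pass through unchanged, and the first category in sorted order (`MORTGAGE` here) becomes the implicit baseline.

```python
import pandas as pd

toy = pd.DataFrame({
    "customer_age": [22, 25, 21],
    "home_ownership": ["RENT", "MORTGAGE", "OWN"],
})

# Three categories yield two indicator columns; MORTGAGE is encoded as all zeros
encoded = pd.get_dummies(toy, drop_first=True)
print(encoded.columns.tolist())
print(encoded)
```

Dropping one level avoids perfectly collinear indicator columns, which matters more for linear models than for an RBF-kernel SVM but does no harm here.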

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
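
If the two loan-status classes are imbalanced, a plain random split can leave the train and test sets with different class ratios. A possible variation, sketched here on synthetic labels (an 80/20 class mix chosen for illustration), is to pass `stratify=y` so both sets keep the original ratio.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 80 samples of class 0 and 20 of class 1 (illustrative only)
X_demo = np.arange(100).reshape(-1, 1)
y_demo = np.array([0] * 80 + [1] * 20)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

# Both splits keep the 20% share of class 1
print("train share of class 1:", y_tr.mean())
print("test share of class 1:", y_te.mean())
```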

In [None]:
# Fit the scaler on the training set only, then apply the same transform to the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
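
Fitting the scaler on the training data alone, as done above, keeps test-set statistics out of the model. One way to preserve that property under cross-validation is to wrap the scaler and the classifier in a pipeline. The sketch below uses synthetic data from `make_classification` as a stand-in for the encoded loan features; the 5-fold setting is an arbitrary choice for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the encoded loan features (not the real dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=42)

# The pipeline refits the scaler on each training fold, so no statistics
# from the held-out fold reach the model
model = make_pipeline(StandardScaler(), SVC())
scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="accuracy")

print("Fold accuracies:", scores.round(3))
print("Mean accuracy:", scores.mean().round(3))
```

A cross-validated mean is usually a steadier estimate than a single train/test split, at the cost of fitting the model several times.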

In [None]:
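# Fit an SVC with scikit-learn's default settings (RBF kernel, C=1.0)
svm = SVC()
svm.fit(X_train, y_train)

# Predict on the held-out test set and report accuracy
y_pred = svm.predict(X_test)

print("Accuracy score:", accuracy_score(y_test, y_pred))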

Accuracy score: 0.8310563145618127
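
Accuracy alone can be hard to interpret when one class dominates, because always predicting the majority class may already score well. The sketch below, on made-up labels and predictions, shows how a confusion matrix, a per-class report, and a majority-class baseline could complement the accuracy score. None of these numbers come from the loan dataset.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Made-up ground truth and predictions for illustration
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 1])
y_hat = np.array([0, 0, 1, 1, 1, 1, 1, 0])

acc = accuracy_score(y_true, y_hat)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_hat)

# Baseline: accuracy obtained by always predicting the most frequent class
majority = np.bincount(y_true).argmax()
baseline_acc = accuracy_score(y_true, np.full_like(y_true, majority))

print("Accuracy:", acc)
print("Majority-class baseline:", baseline_acc)
print(cm)
print(classification_report(y_true, y_hat))
```

In the notebook, the same calls would take `y_test` and `y_pred` in place of the toy arrays.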
