## Loan Approval Dataset

This dataset simulates loan applications and approval outcomes for 2,000 individuals. It contains demographic, financial, and employment-related attributes that can be used to predict whether a loan application will be approved or rejected.
It is ideal for practicing classification problems, credit risk modeling, and feature engineering for financial datasets.

In [34]:
import pandas as pd 

In [42]:
dataset = pd.read_csv('loan_approval.csv')
dataset.head()

Unnamed: 0,name,city,income,credit_score,loan_amount,years_employed,points,loan_approved
0,Allison Hill,East Jill,113810,389,39698,27,50.0,False
1,Brandon Hall,New Jamesside,44592,729,15446,28,55.0,False
2,Rhonda Smith,Lake Roberto,33278,584,11189,13,45.0,False
3,Gabrielle Davis,West Melanieview,127196,344,48823,29,50.0,False
4,Valerie Gray,Mariastad,66048,496,47174,4,25.0,False


In [43]:
dataset.isnull().sum()

name              0
city              0
income            0
credit_score      0
loan_amount       0
years_employed    0
points            0
loan_approved     0
dtype: int64

In [44]:
# Drop the identifier columns (name, city); income, credit_score, loan_amount, years_employed and points are the features used to predict the outcome.
dataset.drop(['name','city'],axis=1,inplace=True) 

In [45]:
dataset['loan_approved'].value_counts()

loan_approved
False    1121
True      879
Name: count, dtype: int64

In [46]:
# Encode the boolean target as 0 (rejected) / 1 (approved)
dataset['loan_approved'] = dataset['loan_approved'].map({False:0,True:1})

In [47]:
dataset['loan_approved'].value_counts()

loan_approved
0    1121
1     879
Name: count, dtype: int64

In [48]:
dataset.head()

Unnamed: 0,income,credit_score,loan_amount,years_employed,points,loan_approved
0,113810,389,39698,27,50.0,0
1,44592,729,15446,28,55.0,0
2,33278,584,11189,13,45.0,0
3,127196,344,48823,29,50.0,0
4,66048,496,47174,4,25.0,0


In [49]:
x = dataset.drop(['loan_approved'],axis=1)

In [50]:
x.head()

Unnamed: 0,income,credit_score,loan_amount,years_employed,points
0,113810,389,39698,27,50.0
1,44592,729,15446,28,55.0
2,33278,584,11189,13,45.0
3,127196,344,48823,29,50.0
4,66048,496,47174,4,25.0


In [51]:
y = dataset['loan_approved']

In [52]:
# Split the data into training (70%) and test (30%) sets
from sklearn.model_selection import train_test_split 
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.3,random_state=50)
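
The split above is purely random. A possible variation, sketched here under assumptions, is a stratified split that keeps the class ratio roughly equal in both partitions. `x_demo` and `y_demo` are synthetic stand-ins with the same row count and class counts as the dataset, not the real features.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for x and y: 2,000 rows, 5 features,
# with the same class counts as the dataset (1121 zeros, 879 ones).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(2000, 5))
y_demo = np.array([0] * 1121 + [1] * 879)

# stratify=y_demo asks train_test_split to preserve the class ratio
# in both the training and the test partition.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.3, random_state=50, stratify=y_demo
)

print(f"train positive rate: {y_tr.mean():.4f}")
print(f"test positive rate:  {y_te.mean():.4f}")
```

With classes as balanced as they are here, the effect is small; stratification matters more when one class is rare.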

In [53]:
# Standardize the features: fit the scaler on the training set only, then apply it to the test set
from sklearn.preprocessing import StandardScaler
scalar = StandardScaler()
x_train_scaled = scalar.fit_transform(x_train)
x_test_scaled = scalar.transform(x_test)
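
Fitting the scaler on the training data only, as done above, keeps test-set statistics out of preprocessing. One way to keep that guarantee during cross-validation is to bundle the scaler and the model in a scikit-learn pipeline, so the scaler is refit on each training fold. This is a minimal sketch on synthetic data; `x_demo`, `y_demo`, and the choice of `LogisticRegression` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: three unscaled numeric features and a target
# that depends on the first feature only.
rng = np.random.default_rng(0)
x_demo = rng.normal(loc=50_000, scale=20_000, size=(500, 3))
y_demo = (x_demo[:, 0] > 50_000).astype(int)

# The pipeline refits StandardScaler inside every fold, so each validation
# fold is scaled with statistics from its own training folds only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(pipeline, x_demo, y_demo, cv=5, scoring="accuracy")

print(f"fold accuracies: {scores.round(3)}")
print(f"mean accuracy:   {scores.mean():.3f}")
```

The same pipeline object can be passed to `fit` and `predict` directly, which removes the need to track separate scaled copies of the data.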

## Training the models

In [54]:
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier,GradientBoostingClassifier,AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier


models = {
    "Logistic regression": LogisticRegression(),
    "SVM Classifier":SVC(),
    "Decision Tree Classifier":DecisionTreeClassifier(),
    "KNeighbour Classifier":KNeighborsClassifier(),
    "Adaboost Classifier":AdaBoostClassifier(),
    "Random Forest Classifier":RandomForestClassifier(),
    "Gradient boosting":GradientBoostingClassifier()
}

In [57]:
from sklearn.metrics import accuracy_score,classification_report,confusion_matrix
# Fit every model on the scaled training data and evaluate it on the scaled test data
for name, model in models.items():
    model.fit(x_train_scaled,y_train)
    predict = model.predict(x_test_scaled)
    score = accuracy_score(y_test,predict)
    report = classification_report(y_test,predict)
    conf_matrix = confusion_matrix(y_test,predict)
    print()
    print(name)
    print(f"Accuracy:{score}")
    print(f"Classification report:{report}")
    # Print the computed matrix; formatting `confusion_matrix` itself would only show the function object
    print(f"Confusion matrix:\n{conf_matrix}")
    print("="*100)


Logistic regression
Accuracy:1.0
Classification report:              precision    recall  f1-score   support

           0       1.00      1.00      1.00       351
           1       1.00      1.00      1.00       249

    accuracy                           1.00       600
   macro avg       1.00      1.00      1.00       600
weighted avg       1.00      1.00      1.00       600

Confusion metrix:<function confusion_matrix at 0x1753156c0>

SVM Classifier
Accuracy:0.9966666666666667
Classification report:              precision    recall  f1-score   support

           0       1.00      0.99      1.00       351
           1       0.99      1.00      1.00       249

    accuracy                           1.00       600
   macro avg       1.00      1.00      1.00       600
weighted avg       1.00      1.00      1.00       600

Confusion metrix:<function confusion_matrix at 0x1753156c0>

Decision Tree Classifier
Accuracy:1.0
Classification report:              precision    recall  f1-score