<a href="https://colab.research.google.com/github/Harithamuralidharan/Financial-Risk-for-Loan-Approval/blob/main/Financial_Risk_For_Loan_Approvel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**Financial Risk For Loan Approval**

##Introduction
In the fast-moving financial world, banks and lending institutions face the constant challenge of approving loans while minimizing financial risk. A wrong decision, either approving a risky borrower or rejecting a reliable one, can have serious consequences. To tackle this, lenders use **data-driven risk assessment** to make smarter, more informed loan approval decisions. This dataset contains crucial financial and personal details about loan applicants, including their **credit history, income, debts, assets, and employment status**. By analysing these factors, we can predict whether an applicant is likely to **repay the loan on time or default**.


##Dataset information

Loan defaults can be a major problem for financial institutions, leading to significant losses. The challenge is to accurately **identify risky borrowers** while ensuring that deserving applicants get access to credit. Traditional credit scoring methods often miss key financial behaviors, and manual assessment can be biased or inconsistent. The goal of this project is to **develop a reliable risk assessment model** that can predict loan approval based on historical applicant data. By leveraging machine learning and data analytics, we aim to reduce bad loans, improve lending efficiency, and create a more transparent and fair loan approval process.

##Core Stages and Components


*   Exploratory Data Analysis **(EDA)**
*   Data preprocessing and data cleaning
*   Handling missing values
*   Feature engineering
*   Feature selection
*   Feature scaling
*   Splitting data into train and test sets
*   Machine learning model building
*   Model evaluation
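
The preprocessing stages listed above can be sketched on a tiny synthetic frame. This is only an illustration under stated assumptions: the values in `toy` are invented, the imputation choices (median and most frequent value) are one common option rather than the method this notebook commits to, and model building and evaluation are left out. The split is done before scaling here so that the scaling statistics come from the training rows only.

```python
import numpy as np
import pandas as pd

# Tiny synthetic frame (hypothetical values) reusing a few column names from the dataset
toy = pd.DataFrame({
    "Age": [45.0, 38.0, np.nan, 37.0, 52.0, 29.0],
    "AnnualIncome": [39948.0, 39709.0, 40724.0, np.nan, 103264.0, 58000.0],
    "EmploymentStatus": ["Employed", "Employed", None, "Unemployed", "Employed", "Unemployed"],
    "LoanApproved": [0, 0, 0, 1, 1, 0],
})
num_cols = ["Age", "AnnualIncome"]
cat_cols = ["EmploymentStatus"]

# Handling missing values: median for numeric columns, most frequent value for categorical ones
toy[num_cols] = toy[num_cols].fillna(toy[num_cols].median())
for col in cat_cols:
    toy[col] = toy[col].fillna(toy[col].mode()[0])

# Encoding: one-hot encode the categorical columns
features = pd.get_dummies(toy.drop(columns="LoanApproved"), columns=cat_cols, dtype=float)
target = toy["LoanApproved"]

# Splitting: shuffle row positions and hold out the last third for testing
rng = np.random.default_rng(42)
order = rng.permutation(len(features))
split = (2 * len(order)) // 3
X_train, X_test = features.iloc[order[:split]], features.iloc[order[split:]]
y_train, y_test = target.iloc[order[:split]], target.iloc[order[split:]]

# Feature scaling: standardize numeric columns with training-set statistics only
mu = X_train[num_cols].mean()
sigma = X_train[num_cols].std(ddof=0).replace(0, 1.0)
X_train_scaled, X_test_scaled = X_train.copy(), X_test.copy()
X_train_scaled[num_cols] = (X_train[num_cols] - mu) / sigma
X_test_scaled[num_cols] = (X_test[num_cols] - mu) / sigma

print(X_train_scaled.shape, X_test_scaled.shape)
```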



##Variables

|S.No|Variable Name|Role|Type|Description|
|--|--------------------------|-------------------|----------|----------|
|1.|Age|Feature|Continuous|Applicant's age|
|2.|AnnualIncome|Feature|Continuous|Yearly income|
|3.|CreditScore|Feature|Continuous|Creditworthiness score|
|4.|EmploymentStatus|Feature|Categorical|Job situation|
|5.|EducationLevel|Feature|Categorical|Highest education attained|
|6.|Experience|Feature|Continuous|Work experience|
|7.|LoanAmount|Feature|Continuous|Requested loan size|
|8.|LoanDuration|Feature|Continuous|Loan repayment period|
|9.|MaritalStatus|Feature|Categorical|Applicant's marital status|
|10.|NumberOfDependents|Feature|Continuous|Number of dependents|
|11.|HomeOwnershipStatus|Feature|Categorical|Homeownership type|
|12.|MonthlyDebtPayments|Feature|Continuous|Monthly debt obligations|
|13.|CreditCardUtilizationRate|Feature|Continuous|Credit card usage percentage|
|14.|NumberOfOpenCreditLines|Feature|Continuous|Active credit lines|
|15.|NumberOfCreditInquiries|Feature|Continuous|Credit check count|
|16.|DebtToIncomeRatio|Feature|Continuous|Debt-to-income proportion|
|17.|BankruptcyHistory|Feature|Categorical|Bankruptcy records|
|18.|LoanPurpose|Feature|Categorical|Reason for loan|
|19.|PreviousLoanDefaults|Feature|Continuous|Prior loan defaults|
|20.|PaymentHistory|Feature|Continuous|Past payment behavior|
|21.|LengthOfCreditHistory|Feature|Continuous|Credit history duration|
|22.|SavingsAccountBalance|Feature|Continuous|Savings account amount|
|23.|CheckingAccountBalance|Feature|Continuous|Checking account funds|
|24.|TotalAssets|Feature|Continuous|Total owned assets|
|25.|TotalLiabilities|Feature|Continuous|Total owed debts|
|26.|MonthlyIncome|Feature|Continuous|Income per month|
|27.|UtilityBillsPaymentHistory|Feature|Continuous|Utility payment record|
|28.|JobTenure|Feature|Continuous|Job duration|
|29.|NetWorth|Feature|Continuous|Total financial worth|
|30.|BaseInterestRate|Feature|Continuous|Starting interest rate|
|31.|InterestRate|Feature|Continuous|Applied interest rate|
|32.|MonthlyLoanPayment|Feature|Continuous|Monthly loan payment|
|33.|TotalDebtToIncomeRatio|Feature|Continuous|Total debt against income|
|34.|LoanApproved|Target|Categorical (binary, 0/1)|Loan approval status|
|35.|RiskScore|Feature|Continuous|Risk assessment score|

In [None]:
#import the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

#loading the dataset
data = pd.read_csv("Loan.csv")

In [None]:
#Initial EDA

# view the top 5 rows in dataset
data.head(5)


Unnamed: 0,Age,AnnualIncome,CreditScore,EmploymentStatus,EducationLevel,Experience,LoanAmount,LoanDuration,MaritalStatus,NumberOfDependents,...,MonthlyIncome,UtilityBillsPaymentHistory,JobTenure,NetWorth,BaseInterestRate,InterestRate,MonthlyLoanPayment,TotalDebtToIncomeRatio,LoanApproved,RiskScore
0,45.0,39948.0,617,Employed,Master,22.0,13152.0,48.0,Married,,...,,0.724972,11.0,126928.0,0.199652,0.22759,419.805992,0.181077,0,49.0
1,38.0,39709.0,628,Employed,Associate,15.0,26045.0,,Single,1.0,...,3309.083333,,3.0,43609.0,0.207045,0.201077,794.054238,0.389852,0,52.0
2,47.0,40724.0,570,Employed,Bachelor,26.0,,,Married,2.0,...,,0.872241,6.0,5205.0,0.217627,0.212548,,,0,52.0
3,,69084.0,545,Employed,High School,34.0,37898.0,96.0,Single,1.0,...,5757.0,0.896155,5.0,99452.0,0.300398,0.300911,1047.50698,0.313098,0,54.0
4,37.0,103264.0,594,Employed,Associate,17.0,9184.0,36.0,Married,,...,8605.333333,0.941369,5.0,227019.0,0.197184,0.17599,330.179141,,1,36.0


In [None]:
#view the bottom 5 rows in dataset
data.tail(5)

Unnamed: 0,Age,AnnualIncome,CreditScore,EmploymentStatus,EducationLevel,Experience,LoanAmount,LoanDuration,MaritalStatus,NumberOfDependents,...,MonthlyIncome,UtilityBillsPaymentHistory,JobTenure,NetWorth,BaseInterestRate,InterestRate,MonthlyLoanPayment,TotalDebtToIncomeRatio,LoanApproved,RiskScore
19995,44.0,30180.0,587,Employed,High School,19.0,24521.0,36.0,Married,3.0,...,2515.0,0.826217,1.0,55327.0,0.216021,0.195574,905.767712,0.627741,0,55.0
19996,56.0,49246.0,567,Employed,Associate,33.0,25818.0,36.0,Married,5.0,...,4103.833333,0.816618,3.0,64002.0,0.227318,0.199168,958.395633,0.334418,0,54.0
19997,44.0,48958.0,645,Employed,Bachelor,20.0,37033.0,72.0,Married,3.0,...,,0.887216,3.0,103663.0,0.229533,0.226766,945.427454,0.357227,0,45.0
19998,60.0,41025.0,560,Employed,High School,36.0,14760.0,72.0,,3.0,...,3418.75,0.843787,5.0,10600.0,0.24976,0.264873,411.168284,0.408678,0,59.0
19999,20.0,53227.0,574,Employed,Associate,0.0,32055.0,48.0,Married,,...,,0.853801,5.0,41372.0,0.240055,0.242693,1049.830407,0.298006,0,59.0
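
In the rows printed above, `MonthlyIncome` looks like `AnnualIncome / 12` wherever both values are present, which may be useful later when handling missing values. The check below uses only the pairs visible in the `head` and `tail` output, so it is an observation about these few rows and not a guarantee for the whole dataset.

```python
# (AnnualIncome, MonthlyIncome) pairs copied from the head/tail output above;
# rows whose MonthlyIncome is missing are left out.
pairs = [
    (39709.0, 3309.083333),
    (69084.0, 5757.0),
    (103264.0, 8605.333333),
    (30180.0, 2515.0),
    (49246.0, 4103.833333),
    (41025.0, 3418.75),
]

# Compare each printed monthly value with annual / 12 (tolerance covers display rounding)
differences = [abs(annual / 12 - monthly) for annual, monthly in pairs]
all_match = all(diff < 1e-3 for diff in differences)

for (annual, monthly), diff in zip(pairs, differences):
    print(f"annual={annual:>9.1f}  monthly={monthly:>12.6f}  |annual/12 - monthly|={diff:.2e}")
print("All displayed rows consistent with AnnualIncome / 12:", all_match)
```

If the relationship holds more widely, a missing `MonthlyIncome` could be filled from `AnnualIncome`; that would need to be verified on the full data first.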


In [None]:
#detailed overview of the dataset (column names, non-null counts, dtypes)
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 35 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   Age                         18251 non-null  float64
 1   AnnualIncome                18719 non-null  float64
 2   CreditScore                 20000 non-null  int64  
 3   EmploymentStatus            17263 non-null  object 
 4   EducationLevel              18303 non-null  object 
 5   Experience                  17396 non-null  float64
 6   LoanAmount                  18847 non-null  float64
 7   LoanDuration                18611 non-null  float64
 8   MaritalStatus               17003 non-null  object 
 9   NumberOfDependents          17146 non-null  float64
 10  HomeOwnershipStatus         18862 non-null  object 
 11  MonthlyDebtPayments         18213 non-null  float64
 12  CreditCardUtilizationRate   18621 non-null  float64
 13  NumberOfOpenCreditLines     185

In [None]:
#return the number of records and features as a (rows, columns) tuple
data.shape

(20000, 35)

In [None]:
#check the missing values in the data set
data.isnull().sum()

Unnamed: 0,0
Age,1749
AnnualIncome,1281
CreditScore,0
EmploymentStatus,2737
EducationLevel,1697
Experience,2604
LoanAmount,1153
LoanDuration,1389
MaritalStatus,2997
NumberOfDependents,2854
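
Raw counts are easier to judge as a share of the 20,000 rows. The sketch below converts the counts printed above into percentages; the dictionary is copied by hand from that output and covers only the columns shown there. On the real frame, `data.isnull().mean() * 100` would give the same figures for every column.

```python
# Missing-value counts copied from the data.isnull().sum() output above
n_rows = 20000
missing_counts = {
    "Age": 1749,
    "AnnualIncome": 1281,
    "CreditScore": 0,
    "EmploymentStatus": 2737,
    "EducationLevel": 1697,
    "Experience": 2604,
    "LoanAmount": 1153,
    "LoanDuration": 1389,
    "MaritalStatus": 2997,
    "NumberOfDependents": 2854,
}

# Share of rows with a missing value, in percent
missing_pct = {col: 100 * count / n_rows for col, count in missing_counts.items()}

# Print columns from most to least affected
for col, pct in sorted(missing_pct.items(), key=lambda item: item[1], reverse=True):
    print(f"{col:<20}{pct:6.2f}%")

worst_column = max(missing_pct, key=missing_pct.get)
```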


In [None]:
#statistical summary of the dataset
data.describe()

Unnamed: 0,Age,AnnualIncome,CreditScore,Experience,LoanAmount,LoanDuration,NumberOfDependents,MonthlyDebtPayments,CreditCardUtilizationRate,NumberOfOpenCreditLines,...,MonthlyIncome,UtilityBillsPaymentHistory,JobTenure,NetWorth,BaseInterestRate,InterestRate,MonthlyLoanPayment,TotalDebtToIncomeRatio,LoanApproved,RiskScore
count,18251.0,18719.0,20000.0,17396.0,18847.0,18611.0,17146.0,18213.0,18621.0,18586.0,...,17928.0,18712.0,18862.0,17777.0,18163.0,17287.0,18846.0,18109.0,20000.0,20000.0
mean,39.769876,59126.566697,571.6124,17.526903,24888.323128,53.968083,1.520588,454.831384,0.286318,3.031798,...,4897.727526,0.799932,5.005779,72279.46,0.239137,0.239233,912.013561,0.401987,0.239,50.76678
std,11.612307,40345.662812,50.997358,11.339904,13443.215537,24.643069,1.389189,241.531978,0.159776,1.7312,...,3297.612278,0.120924,2.236653,117987.8,0.035633,0.042083,674.887461,0.340166,0.426483,7.778262
min,18.0,15000.0,343.0,0.0,3674.0,12.0,0.0,60.0,0.000974,0.0,...,1250.0,0.270036,0.0,1000.0,0.130101,0.11331,97.030193,0.016043,0.0,28.8
25%,32.0,31571.5,540.0,9.0,15593.0,36.0,0.0,286.0,0.160907,2.0,...,2625.770833,0.727258,3.0,8739.0,0.213846,0.209347,494.273727,0.179518,0.0,46.0
50%,40.0,48445.0,578.0,17.0,21904.0,48.0,1.0,402.0,0.266493,3.0,...,4044.708333,0.820795,5.0,32891.0,0.236161,0.235679,729.389161,0.302429,0.0,52.0
75%,48.0,74374.0,609.0,25.0,30836.5,72.0,3.0,564.0,0.390221,4.0,...,6175.979167,0.892627,6.0,88413.0,0.261634,0.265454,1113.154416,0.508214,0.0,56.0
max,80.0,422480.0,712.0,61.0,184732.0,120.0,5.0,2919.0,0.91738,12.0,...,25000.0,0.999433,16.0,2603208.0,0.405029,0.446787,10892.62952,4.647657,1.0,84.0


In [None]:
#check the number of duplicate records

data.duplicated().sum()

0

In [None]:
#get all categorical columns in the dataset
cate_col = data.select_dtypes('O').columns.tolist()
cate_col

['EmploymentStatus',
 'EducationLevel',
 'MaritalStatus',
 'HomeOwnershipStatus',
 'LoanPurpose']

In [None]:
#get all numerical columns in the dataset
num_col =data.select_dtypes(include='number').columns.tolist()
num_col


['Age',
 'AnnualIncome',
 'CreditScore',
 'Experience',
 'LoanAmount',
 'LoanDuration',
 'NumberOfDependents',
 'MonthlyDebtPayments',
 'CreditCardUtilizationRate',
 'NumberOfOpenCreditLines',
 'NumberOfCreditInquiries',
 'DebtToIncomeRatio',
 'BankruptcyHistory',
 'PreviousLoanDefaults',
 'PaymentHistory',
 'LengthOfCreditHistory',
 'SavingsAccountBalance',
 'CheckingAccountBalance',
 'TotalAssets',
 'TotalLiabilities',
 'MonthlyIncome',
 'UtilityBillsPaymentHistory',
 'JobTenure',
 'NetWorth',
 'BaseInterestRate',
 'InterestRate',
 'MonthlyLoanPayment',
 'TotalDebtToIncomeRatio',
 'LoanApproved',
 'RiskScore']
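
Some columns in this numeric list, such as `BankruptcyHistory` and `LoanApproved`, may really be flags stored as numbers. One way to spot such columns is to look for numeric columns with at most two distinct values. The frame below is a hypothetical stand-in for `data` with invented values; in particular, the 0/1 coding of `BankruptcyHistory` is an assumption, since the notebook does not show its values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `data`; values are invented for illustration
sample = pd.DataFrame({
    "CreditScore": [617, 628, 570, 545, 594],
    "BankruptcyHistory": [0.0, np.nan, 0.0, 1.0, 0.0],
    "LoanApproved": [0, 0, 0, 0, 1],
})

numeric_cols = sample.select_dtypes(include="number").columns

# nunique() ignores missing values by default, so NaN does not count as a category
flag_like_cols = [col for col in numeric_cols if sample[col].nunique() <= 2]
print(flag_like_cols)
```

Columns found this way are candidates to treat as categorical during encoding and scaling.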

In [None]:
#count each combination of values across the categorical columns
#(DataFrame.value_counts() counts unique rows and, by default, skips rows with missing values)

combo_counts = data[cate_col].value_counts()
print(f'{cate_col}:{combo_counts}')


['EmploymentStatus', 'EducationLevel', 'MaritalStatus', 'HomeOwnershipStatus', 'LoanPurpose']:EmploymentStatus  EducationLevel  MaritalStatus  HomeOwnershipStatus  LoanPurpose       
Employed          High School     Married        Mortgage             Home                  196
                  Bachelor        Married        Mortgage             Debt Consolidation    176
                  High School     Married        Mortgage             Debt Consolidation    162
                  Bachelor        Married        Mortgage             Home                  159
                  High School     Married        Rent                 Home                  137
                                                                                           ... 
Unemployed        Doctorate       Married        Own                  Debt Consolidation      1
                                                                      Education               1
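
The output above tallies combinations of all five categorical columns at once, so the frequency of each individual category is hard to read off. A per-column breakdown may be easier to scan. The sketch below uses a small stand-in frame (`sample`, with hypothetical values) rather than the real `data`.

```python
import pandas as pd

# Hypothetical mini-frame standing in for `data`; only the column names come from the notebook
sample = pd.DataFrame({
    "EmploymentStatus": ["Employed", "Employed", "Unemployed", None],
    "MaritalStatus": ["Married", "Single", "Married", "Married"],
})
cat_cols = ["EmploymentStatus", "MaritalStatus"]

per_column_counts = {}
for col in cat_cols:
    # dropna=False keeps missing values visible as their own entry
    per_column_counts[col] = sample[col].value_counts(dropna=False)
    print(f"{col}:\n{per_column_counts[col]}\n")
```

In the notebook, the same loop could run over `cate_col` on `data`.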
                                                 

In [None]:
#count how many applications fall into each class of the target column LoanApproved
data['LoanApproved'].value_counts()

LoanApproved,count
0,15220
1,4780
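
The two classes are clearly unbalanced, which matters when the model is evaluated later. The arithmetic below uses only the counts printed above. The "majority baseline" is an illustrative yardstick (the accuracy of always predicting class 0) and is not a result from this notebook.

```python
# Class counts copied from the value_counts() output above
class_counts = {0: 15220, 1: 4780}
total = sum(class_counts.values())

# Share of approved applications; matches the LoanApproved mean in data.describe()
approval_rate = class_counts[1] / total

# How many rejected applications there are per approved one
imbalance_ratio = class_counts[0] / class_counts[1]

# Accuracy of a trivial model that always predicts the majority class
majority_baseline_accuracy = max(class_counts.values()) / total

print(f"approval rate:     {approval_rate:.3f}")
print(f"imbalance ratio:   {imbalance_ratio:.2f} : 1")
print(f"majority baseline: {majority_baseline_accuracy:.3f}")
```

With this split, accuracy alone can look good for a model that never approves anyone, so metrics such as precision and recall on class 1 are worth reporting as well.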
