## Credit Score Classification: Case Study

A person's credit score indicates their creditworthiness. It helps financial companies judge whether an applicant is likely to repay the loan or credit they are applying for.

Banks and credit card companies use three credit score categories to label their customers:

* Good
* Standard
* Poor

A person with a good credit score finds it easier to get loans from banks and other financial institutions.

The task is to find relationships in the data that reflect how banks classify credit scores, and to train a model that classifies a person's credit score.

In [None]:
# Importing necessary Python libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
pio.templates.default = "plotly_white"

In [None]:
# Importing the dataset and creating a dataframe
data = pd.read_csv('train.csv')

In [None]:
# Displaying first five rows of the dataset
data.head()

In [None]:
# Check the shape of the dataset
data.shape

The dataset contains 100,000 rows and 28 columns.

In [None]:
# Information about the columns of the dataset
data.info()

There are 7 categorical columns and 21 numerical columns in the dataset.
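
The split between categorical and numerical columns can also be counted programmatically rather than read off the `data.info()` output. The following is a minimal sketch on a made-up toy frame (the values are invented for illustration, and treating every non-numeric column as categorical is an assumption):

```python
import pandas as pd

# Toy stand-in for the real dataset; all values are made up for illustration.
toy = pd.DataFrame({
    "Occupation": ["Engineer", "Teacher", "Doctor"],
    "Annual_Income": [52000.0, 38000.0, 91000.0],
    "Num_Bank_Accounts": [2, 5, 3],
    "Credit_Score": ["Good", "Standard", "Poor"],
})

# Assumption: any column that is not numeric is treated as categorical.
categorical_cols = toy.select_dtypes(exclude="number").columns.tolist()
numerical_cols = toy.select_dtypes(include="number").columns.tolist()

print(len(categorical_cols), "categorical:", categorical_cols)
print(len(numerical_cols), "numerical:", numerical_cols)
```

Running the same two `select_dtypes` calls on `data` should reproduce the counts stated above, provided the columns were loaded with the expected dtypes.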

In [None]:
# Checking the missing values in the dataset
data.isnull().sum()

The dataset doesn't have any null values.

### Exploratory Data Analysis

**Analysing Credit Score**

In [None]:
print(data['Credit_Score'].value_counts())
data['Credit_Score'].value_counts().plot(kind='bar')
plt.show()
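
Raw counts are easier to compare as percentages, especially when checking whether the classes are balanced before training. A small sketch, using a made-up label series in place of `data['Credit_Score']`:

```python
import pandas as pd

# Made-up labels standing in for data['Credit_Score'].
labels = pd.Series(
    ["Standard"] * 5 + ["Poor"] * 3 + ["Good"] * 2, name="Credit_Score"
)

# normalize=True returns fractions; convert them to percentages.
shares = labels.value_counts(normalize=True).mul(100).round(1)
print(shares)
```

If one class dominates, plain accuracy can look better than the model really is, so the class shares are worth keeping in mind when the model is evaluated.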

**Analysing Impact of Various Occupations on Credit Score**

In [None]:
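# A box plot needs a numeric axis, so count the credit score classes per occupation instead
fig = px.histogram(data, x='Occupation', color='Credit_Score',
                   barmode='group',
                   title='Credit Scores Based on Occupation',
                   color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.show()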

There is not much difference in credit scores across the occupations in the data.

**Analysing Impact of Income on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Annual_Income',  
             color='Credit_Score',
             title='Credit Scores Based on Annual Income',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

The higher a person's annual income, the better their credit score tends to be.
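
The quartiles that a box plot draws can also be computed directly, which makes statements like this one easy to check numerically. Below is a minimal sketch with a hypothetical helper, `quartiles_by_score`, applied to made-up data. Note that pandas interpolates quantiles linearly by default, so the numbers may differ slightly from plotly's `exclusive` quartile method.

```python
import pandas as pd

# Made-up data standing in for the real dataset.
toy = pd.DataFrame({
    "Credit_Score": ["Poor"] * 3 + ["Standard"] * 3 + ["Good"] * 3,
    "Annual_Income": [20000, 30000, 40000,
                      30000, 45000, 60000,
                      50000, 70000, 90000],
})

def quartiles_by_score(df, column):
    """Return the 25th, 50th and 75th percentiles of `column` per credit score class."""
    return (
        df.groupby("Credit_Score")[column]
          .quantile([0.25, 0.5, 0.75])
          .unstack()
    )

summary = quartiles_by_score(toy, "Annual_Income")
print(summary)
```

The same helper could be called with any of the numerical columns examined in this section, for example `quartiles_by_score(data, "Outstanding_Debt")`.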

**Analysing Impact of Monthly In-hand Salary on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Monthly_Inhand_Salary',  
             color='Credit_Score',
             title='Credit Scores Based on Monthly_Inhand_Salary',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

The higher a person's monthly in-hand salary, the better their credit score tends to be.

**Analysing Impact of Number of Bank Accounts on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Num_Bank_Accounts',  
             color='Credit_Score',
             title='Credit Scores Based on Number of Bank Accounts',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

Maintaining more than five bank accounts is associated with a poorer credit score. People with good credit scores typically have only 2-3 bank accounts, so holding many accounts appears to hurt the score.

**Analysing Impact of Number of Credit Cards on Credit Score** 

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Num_Credit_Card',  
             color='Credit_Score',
             title='Credit Scores Based on Number of Credit Cards',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

As with bank accounts, holding many credit cards is associated with a poorer credit score. Having 3-5 cards is associated with a good credit score.

**Analysing Impact of Interest Rate of loans and EMIs on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Interest_Rate',  
             color='Credit_Score',
             title='Credit Scores Based on Avg Interest Rate',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

An average interest rate of 4-11% is associated with a good credit score, while an average interest rate above 15% is associated with a poor one.

**Analysing Impact of Number of loans taken on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Num_of_Loan',  
             color='Credit_Score',
             title='Credit Scores Based on Number of Loans',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

People with good credit scores typically have 1-3 loans. Having more than 3 loans negatively impacts the credit score.

**Analysing Impact of Type of Loan taken on Credit Score**

In [None]:
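# A box plot needs a numeric axis, so count the credit score classes per loan type instead
fig = px.histogram(data, x='Type_of_Loan', color='Credit_Score',
                   barmode='group',
                   title='Credit Scores Based on Type of Loans',
                   color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.show()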

Loans such as personal loans, home equity loans, mortgage loans, auto loans, and credit-builder loans are associated with better credit scores than other types of loans.

**Analysing Impact of Delay of Payment from due date on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Delay_from_due_date',  
             color='Credit_Score',
             title='Credit Scores Based on Avg number of days delayed for credit card payments',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

Delaying credit card payments by 5-14 days past the due date does not appear to affect the credit score, but delaying them by more than 17 days negatively impacts it.

**Analysing Impact of Number of Delayed Payments on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Num_of_Delayed_Payment',  
             color='Credit_Score',
             title='Credit Scores Based on Number of Delayed Payments',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

Delaying 4-12 payments does not appear to affect the credit score, but delaying more than 12 payments negatively impacts it.

**Analysing Impact of Outstanding debts on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Outstanding_Debt',  
             color='Credit_Score',
             title='Credit Scores Based on Outstanding Debts',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

An outstanding debt of 380-1,150 dollars does not appear to affect the credit score, but consistently carrying a debt of more than 1,338 dollars affects it negatively.

**Analysing Impact of Credit Utilization Ratio on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Credit_Utilization_Ratio',  
             color='Credit_Score',
             title='Credit Scores Based on Credit Utilization Ratio',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

According to the visualization, the credit utilization ratio does not appear to affect a person's credit score.
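
A visual impression of "no effect" can be backed by a number. One option, which is not part of the original analysis, is the correlation ratio (eta squared): the share of a feature's variance that is explained by the credit score class. Values near 0 mean the classes have almost the same mean; values near 1 mean the classes are well separated. The sketch below uses a hypothetical helper, `eta_squared`, on made-up data:

```python
import pandas as pd

def eta_squared(df, group_col, value_col):
    """Share of the variance in `value_col` explained by the groups in `group_col`."""
    overall_mean = df[value_col].mean()
    groups = df.groupby(group_col)[value_col]
    # Between-group sum of squares: group size times squared distance of group mean
    ss_between = (groups.count() * (groups.mean() - overall_mean) ** 2).sum()
    # Total sum of squares around the overall mean
    ss_total = ((df[value_col] - overall_mean) ** 2).sum()
    return ss_between / ss_total

# Made-up data: debt differs strongly by class, utilization ratio does not.
toy = pd.DataFrame({
    "Credit_Score": ["Poor"] * 3 + ["Standard"] * 3 + ["Good"] * 3,
    "Outstanding_Debt": [2000, 2100, 2200, 1000, 1100, 1200, 300, 400, 500],
    "Credit_Utilization_Ratio": [30, 32, 34, 30, 32, 34, 30, 32, 34],
})

eta_debt = eta_squared(toy, "Credit_Score", "Outstanding_Debt")
eta_ratio = eta_squared(toy, "Credit_Score", "Credit_Utilization_Ratio")
print(round(eta_debt, 3), round(eta_ratio, 3))
```

Eta squared only compares class means, so it complements the box plots rather than replacing them: two classes can share a mean and still differ in spread.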

**Analysing Impact of Credit History Age on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Credit_History_Age',  
             color='Credit_Score',
             title='Credit Scores Based on Credit History Age',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

A longer credit history is associated with a better credit score.

**Analysing Impact of Total EMI per month on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Total_EMI_per_month',  
             color='Credit_Score',
             title='Credit Scores Based on Total EMI per month',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

The total EMI paid per month does not appear to affect a person's credit score.

**Analysing Impact of Monthly Amount Invested on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Amount_invested_monthly',  
             color='Credit_Score',
             title='Credit Scores Based on Monthly Invested Amount',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

The amount of money invested monthly does not appear to affect a person's credit score.

**Analysing Impact of Monthly Balance on Credit Score**

In [None]:
fig = px.box(data, x='Credit_Score', y= 'Monthly_Balance',  
             color='Credit_Score',
             title='Credit Scores Based on Monthly Balance',
             color_discrete_map={'Poor':'red', 'Standard':'yellow', 'Good':'green'})
fig.update_traces(quartilemethod='exclusive')
fig.show()

A high monthly balance is associated with a good credit score. A monthly balance of less than 250 dollars is associated with a poor credit score.

### Model Building

In [None]:
# Transforming the categorical Credit_Mix column into a numerical column

data['Credit_Mix'] = data['Credit_Mix'].map({'Standard': 1, 'Good': 2, 'Bad': 3})
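
`Series.map` with a dictionary turns every value that is missing from the dictionary into `NaN`, which would later break model training. A quick check for unmapped categories is sketched below; the `"_"` placeholder value is hypothetical and only there to trigger the case:

```python
import pandas as pd

# Made-up column; "_" is a hypothetical unexpected category.
credit_mix = pd.Series(["Standard", "Good", "Bad", "_"], name="Credit_Mix")
mapping = {"Standard": 1, "Good": 2, "Bad": 3}

encoded = credit_mix.map(mapping)

# Original values that the mapping did not cover end up as NaN after map().
unmapped = credit_mix[encoded.isna()].unique().tolist()
print("Unmapped categories:", unmapped)
```

An empty list here means every category was covered by the mapping.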

In [None]:
from sklearn.model_selection import train_test_split
x = np.array(data[["Annual_Income", "Monthly_Inhand_Salary", 
                   "Num_Bank_Accounts", "Num_Credit_Card", 
                   "Interest_Rate", "Num_of_Loan", 
                   "Delay_from_due_date", "Num_of_Delayed_Payment", 
                   "Credit_Mix", "Outstanding_Debt", 
                   "Credit_History_Age", "Monthly_Balance"]])
y = np.array(data["Credit_Score"])  # 1-D label array, as scikit-learn expects

In [None]:
from sklearn.ensemble import RandomForestClassifier

# Hold out 33% of the rows for testing
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.33, random_state=42)

# Train a random forest classifier on the training split
model = RandomForestClassifier()
model.fit(xtrain, ytrain)
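
The held-out split makes it possible to measure how well the classifier generalizes. The sketch below shows one way to do that with `accuracy_score` and `classification_report` from scikit-learn. It runs on synthetic data with a made-up labelling rule, so the numbers say nothing about the real dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real features (values are made up).
rng = np.random.default_rng(42)
n = 300
debt = rng.uniform(0, 3000, n)
delay = rng.uniform(0, 40, n)
X = np.column_stack([debt, delay])

# Made-up labelling rule, used only so the toy data has learnable structure.
y = np.where(debt > 2000, "Poor", np.where(debt > 1000, "Standard", "Good"))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

toy_model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
y_pred = toy_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print(classification_report(y_test, y_pred))
```

With the real model, the equivalent calls would be `model.predict(xtest)` followed by the same two metric functions on `ytest`. The per-class report is useful because overall accuracy can hide weak performance on a smaller class.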

In [None]:
print("Credit Score Prediction : ")
a = float(input("Annual Income: "))
b = float(input("Monthly Inhand Salary: "))
c = float(input("Number of Bank Accounts: "))
d = float(input("Number of Credit cards: "))
e = float(input("Interest rate: "))
f = float(input("Number of Loans: "))
g = float(input("Average number of days delayed by the person: "))
h = float(input("Number of delayed payments: "))
i = float(input("Credit Mix (Standard: 1, Good: 2, Bad: 3) : "))
j = float(input("Outstanding Debt: "))
k = float(input("Credit History Age: "))
l = float(input("Monthly Balance: "))

features = np.array([[a, b, c, d, e, f, g, h, i, j, k, l]])
print("Predicted Credit Score = ", model.predict(features))
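
For non-interactive use, the same prediction step can be wrapped so that the features are passed by name and always arrive in the order the model was trained on. The following is a minimal sketch: `build_feature_row` is a hypothetical helper, and the toy model is trained on random data purely so the example runs on its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Feature order used when building x for training.
FEATURES = ["Annual_Income", "Monthly_Inhand_Salary",
            "Num_Bank_Accounts", "Num_Credit_Card",
            "Interest_Rate", "Num_of_Loan",
            "Delay_from_due_date", "Num_of_Delayed_Payment",
            "Credit_Mix", "Outstanding_Debt",
            "Credit_History_Age", "Monthly_Balance"]

def build_feature_row(applicant):
    """Turn a dict of feature values into a 1 x 12 array in training order."""
    missing = [name for name in FEATURES if name not in applicant]
    if missing:
        raise KeyError(f"Missing features: {missing}")
    return np.array([[float(applicant[name]) for name in FEATURES]])

# Toy model trained on random data (illustration only, not the real model).
rng = np.random.default_rng(0)
X_toy = rng.uniform(0, 100, size=(60, len(FEATURES)))
y_toy = rng.choice(["Good", "Standard", "Poor"], size=60)
toy_model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_toy, y_toy)

# Made-up applicant with every feature set to 1.0.
applicant = {name: 1.0 for name in FEATURES}
row = build_feature_row(applicant)
prediction = toy_model.predict(row)
print("Predicted Credit Score =", prediction)
```

Passing values by name avoids the silent errors that come from entering twelve numbers in the wrong order.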