# Exploring Credit Rating Data

Using dummy data, let's explore credit rating data with the following scenario:
A financial institution needs to process 20 car financing applications.
Its criterion is to approve applicants with a credit score of "good" or above.
How many applicants will be approved?

![car_financing](https://d3htfp50oui98k.cloudfront.net/wp-content/uploads/2019/01/11134658/finance.jpg)




In [1]:
# Build the DataFrame of 20 car financing applications

import pandas as pd
application1 = ['BMW', 'Richmond', 720]
application2 = ['Audi', 'North Vancouver', 800]
application3 = ['Honda', 'Burnaby', 650]
application4 = ['Kia', 'Burnaby', 800]
application5 = ['Audi', 'Vancouver', 740]
application6 = ['BMW', 'Richmond', 725]
application7 = ['Hyundai', 'Vancouver', 855]
application8 = ['Toyota', 'Surrey', 790]
application9 = ['Toyota', 'Langley', 740]
application10 = ['Audi', 'Richmond', 805]
application11 = ['Mercedes Benz', 'Richmond', 695]
application12 = ['Honda', 'Burnaby', 760]
application13 = ['Kia', 'Vancouver', 780]
application14 = ['Audi', 'Vancouver', 845]
application15 = ['BMW', 'Richmond', 695]
application16 = ['Toyota', 'Burnaby', 825]
application17 = ['Kia', 'North Vancouver', 815]
application18 = ['Hyundai', 'Vancouver', 600]
application19 = ['BMW', 'Richmond', 705]
application20 = ['Audi', 'Richmond', 809]
column_names = ['Car Type', 'Location', 'Score']

car_financing_df = pd.DataFrame(data=[application1, application2, application3, application4, application5, application6, application7, application8, application9, application10, 
                                application11, application12, application13, application14, application15, application16, application17, application18, application19, application20], 
                          columns=column_names)
car_financing_df

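|   | Car Type | Location | Score |
|---|----------|----------|-------|
| 0 | BMW | Richmond | 720 |
| 1 | Audi | North Vancouver | 800 |
| 2 | Honda | Burnaby | 650 |
| 3 | Kia | Burnaby | 800 |
| 4 | Audi | Vancouver | 740 |
| 5 | BMW | Richmond | 725 |
| 6 | Hyundai | Vancouver | 855 |
| 7 | Toyota | Surrey | 790 |
| 8 | Toyota | Langley | 740 |
| 9 | Audi | Richmond | 805 |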


To be rated at least "good", a score must be 743 or higher. Writing $\chi$ for an applicant's score, the criterion is:

$$
\chi \geq 743
$$

We can use the `.loc` indexer with a boolean condition to find the applications with a good score or better:

```python
df.loc[df['Score'] >= 743]
```
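
To see why this works, it can help to split the expression into two steps. The sketch below uses a hypothetical three-row frame named `sample_df` (illustration only, not the scenario data) to show the comparison producing a boolean Series, which `.loc` then uses as a row filter.

```python
import pandas as pd

# Hypothetical mini-frame for illustration only; not the scenario data.
sample_df = pd.DataFrame({"Score": [720, 800, 743]})

# Step 1: the comparison yields one True/False value per row.
mask = sample_df["Score"] >= 743

# Step 2: .loc keeps only the rows where the mask is True.
selected = sample_df.loc[mask]

print(mask.tolist())
print(selected)
```

Because the comparison is `>=`, a score of exactly 743 is kept, and the selected rows retain their original index labels.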

In [2]:
# Find the applications that meet the criterion of a good score or higher

car_financing_good = car_financing_df.loc[car_financing_df['Score'] >= 743]
car_financing_good

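|    | Car Type | Location | Score |
|----|----------|----------|-------|
| 1  | Audi | North Vancouver | 800 |
| 3  | Kia | Burnaby | 800 |
| 6  | Hyundai | Vancouver | 855 |
| 7  | Toyota | Surrey | 790 |
| 9  | Audi | Richmond | 805 |
| 11 | Honda | Burnaby | 760 |
| 12 | Kia | Vancouver | 780 |
| 13 | Audi | Vancouver | 845 |
| 15 | Toyota | Burnaby | 825 |
| 16 | Kia | North Vancouver | 815 |
| 19 | Audi | Richmond | 809 |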


In [3]:
# Count the number of approved applications

car_financing_good.count()


Car Type    11
Location    11
Score       11
dtype: int64

One of the pandas methods we can use is:
    
```python
dataframe.count()
```

With it, we can quickly count the number of applications that would be approved.
The result shows that 11 out of 20 applications meet the criterion for approval.
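
Note that `count()` reports the number of non-missing values in each column, which is why it returns one number per column. As a minimal sketch of alternatives that return a single row count, the example below uses a hypothetical frame `approved_df` (not the scenario data) with one deliberately missing value to make the difference visible.

```python
import pandas as pd

# Hypothetical frame for illustration; the None is deliberate.
approved_df = pd.DataFrame({
    "Car Type": ["Audi", "Kia", None],
    "Score": [800, 780, 855],
})

per_column = approved_df.count()   # non-missing values per column
row_total = len(approved_df)       # number of rows, missing values included
also_rows = approved_df.shape[0]   # same row count via the shape tuple

print(per_column)
print(row_total, also_rows)
```

In the scenario data no values are missing, so all three approaches agree.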

![approved](https://millionmilesecrets.com/wp-content/uploads/Approved-1-1.jpg)


For the total number of approved applications ($\eta$), we can say:

$$
\eta = 11 
$$
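
As a small worked step, the same count can be expressed as an approval rate. Here $N$ is a symbol introduced only for this illustration, standing for the 20 applications submitted:

$$
\frac{\eta}{N} = \frac{11}{20} = 0.55
$$

That is, 55% of the applications meet the criterion.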

In [4]:
# Create a bar chart of approved applications per location

import altair as alt
car_financing_plot = alt.Chart(car_financing_good).mark_bar().encode(
    x = 'Location',
    y = 'count()'
)
car_financing_plot
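
The bar heights can also be checked numerically with `value_counts()`. The sketch below re-enters the Location values of the 11 approved applications by hand (copied from the data above) so that it runs on its own; in the notebook, calling `value_counts()` on `car_financing_good['Location']` should give the same counts.

```python
import pandas as pd

# Locations of the 11 approved applications, copied from the data above
approved_locations = pd.Series(
    ["North Vancouver", "Burnaby", "Vancouver", "Surrey", "Richmond",
     "Burnaby", "Vancouver", "Vancouver", "Burnaby", "North Vancouver",
     "Richmond"],
    name="Location",
)

# Count approved applications per location, largest first
location_counts = approved_locations.value_counts()
print(location_counts)
```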

In [5]:
# Group the approved applications by car type

car_financing_good_car = car_financing_good.groupby(by='Car Type')
car_financing_good_car.groups



{'Audi': [1, 9, 13, 19], 'Honda': [11], 'Hyundai': [6], 'Kia': [3, 12, 16], 'Toyota': [7, 15]}
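
The `.groups` attribute maps each car type to the row labels in that group. To get the number of applications per car type directly, `size()` can be used instead. A minimal sketch, with the approved car types and their row labels re-entered by hand so the example is self-contained:

```python
import pandas as pd

# Car types of the 11 approved applications, with their original row labels
approved_cars = pd.DataFrame(
    {"Car Type": ["Audi", "Kia", "Hyundai", "Toyota", "Audi", "Honda",
                  "Kia", "Audi", "Toyota", "Kia", "Audi"]},
    index=[1, 3, 6, 7, 9, 11, 12, 13, 15, 16, 19],
)

# size() returns the number of rows in each group
car_type_counts = approved_cars.groupby("Car Type").size()
print(car_type_counts)
```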

In conclusion, of the 20 applications submitted, 11 would be approved for car financing. From the plot, we can see that Burnaby and Vancouver have the highest number of approved applications, with 3 each. The approved applications cover 5 car types: Audi, Honda, Hyundai, Kia, and Toyota.
