### Introduction

For most financial institutions, such as banks and multi-finance companies, lending is the main source of income. Lending also exposes the lender to credit risk: a debtor may stop repaying the loan, which causes a loss for the lender. To reduce these losses, lenders need to decide carefully who qualifies for a loan, at what interest rate, and for what amount.

### Data Used

The dataset comes from a multi-finance company.

The columns are described below:
- `LN_ID`, Loan ID
- `TARGET`, Target variable (1 - client with late payment more than X days, 0 - all other cases)
- `CONTRACT_TYPE`, Identification if loan is cash or revolving
- `GENDER`, Gender of the client
- `NUM_CHILDREN`, Number of children the client has
- `INCOME`, Monthly income of the client
- `APPROVED_CREDIT`, Approved credit amount of the loan
- `ANNUITY`, Loan annuity (amount that must be paid monthly)
- `PRICE`, For consumer loans it is the price of the goods for which the loan is given
- `INCOME_TYPE`, Client's income type (businessman, working, maternity leave, ...)
- `EDUCATION`, The client's highest education
- `FAMILY_STATUS`, Family status of the client
- `HOUSING_TYPE`, What is the housing situation of the client (renting, living with parents, ...)
- `DAYS_AGE`, Client's age in days at the time of application
- `DAYS_WORK`, How many days before the application the person started current job
- `DAYS_REGISTRATION`, How many days before the application did client change his registration
- `DAYS_ID_CHANGE`, How many days before the application did client change the identity document with which he applied for the loan
- `WEEKDAYS_APPLY`, On which day of the week did the client apply for the loan
- `HOUR_APPLY`, Approximately at what hour did the client apply for the loan
- `ORGANIZATION_TYPE`, Type of organization where client works
- `EXT_SCORE_1`, Normalized score from external data source
- `EXT_SCORE_2`, Normalized score from external data source
- `EXT_SCORE_3`, Normalized score from external data source
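
The `DAYS_*` columns count days relative to the application date, so events in the past appear as negative numbers. The sketch below shows one way such columns might be converted into positive years. The helper name `days_to_years`, the 365.25-day year, and the treatment of `365243` as a placeholder are assumptions for illustration; the column descriptions above do not state them. The sample values are taken from the data preview in this notebook, where `365243` appears in `DAYS_WORK` for a pensioner.

```python
import pandas as pd

# Small illustrative frame; values copied from the preview rows in this notebook.
sample = pd.DataFrame({
    "DAYS_AGE": [-11539, -15743, -20775],
    "DAYS_WORK": [-921, -4482, 365243],
})

# Assumption: 365243 is a placeholder rather than a real duration
# (it would correspond to roughly 1000 years in the future).
PLACEHOLDER_DAYS = 365243


def days_to_years(days: pd.Series) -> pd.Series:
    """Convert 'days before application' (negative) into positive years."""
    return (-days / 365.25).round(1)


sample["AGE_YEARS"] = days_to_years(sample["DAYS_AGE"])

# Mask the assumed placeholder so it becomes NaN instead of a nonsense duration.
work_days = sample["DAYS_WORK"].mask(sample["DAYS_WORK"] == PLACEHOLDER_DAYS)
sample["WORK_YEARS"] = days_to_years(work_days)

print(sample)
```

Whether the placeholder rows should be left as missing, imputed, or flagged with a separate indicator column depends on the model used later, so this step is best treated as a starting point.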

### Libraries Used

In [5]:
import warnings
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import confusion_matrix, classification_report, f1_score, roc_curve, roc_auc_score

sns.set(style='darkgrid')
pd.options.display.max_columns = 50
warnings.filterwarnings("ignore")
%matplotlib inline

In [4]:
pd.read_csv('./Dataset/app_train.csv').head()

| Unnamed: 0.1 | Unnamed: 0 | LN_ID | TARGET | CONTRACT_TYPE | GENDER | NUM_CHILDREN | INCOME | APPROVED_CREDIT | ANNUITY | PRICE | INCOME_TYPE | EDUCATION | FAMILY_STATUS | HOUSING_TYPE | DAYS_AGE | DAYS_WORK | DAYS_REGISTRATION | DAYS_ID_CHANGE | WEEKDAYS_APPLY | HOUR_APPLY | ORGANIZATION_TYPE | EXT_SCORE_1 | EXT_SCORE_2 | EXT_SCORE_3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 201468 | 333538 | 0 | Revolving loans | F | 1 | 67500.0 | 202500.0 | 10125.0 | 202500.0 | Working | Secondary / secondary special | Married | With parents | -11539 | -921 | -119.0 | -2757 | TUESDAY | 18 | Business Entity Type 3 | 0.572805 | 0.608276 | |
| 1 | 264803 | 406644 | 0 | Cash loans | F | 1 | 202500.0 | 976711.5 | 49869.0 | 873000.0 | Commercial associate | Secondary / secondary special | Married | House / apartment | -15743 | -4482 | -1797.0 | -2455 | TUESDAY | 14 | Other | 0.6556 | 0.684298 | |
| 2 | 137208 | 259130 | 0 | Cash loans | F | 0 | 180000.0 | 407520.0 | 25060.5 | 360000.0 | Pensioner | Secondary / secondary special | Married | House / apartment | -20775 | 365243 | -8737.0 | -4312 | THURSDAY | 14 | NA1 | | 0.580687 | 0.749022 |
| 3 | 269220 | 411997 | 0 | Cash loans | M | 0 | 225000.0 | 808650.0 | 26086.5 | 675000.0 | State servant | Higher education | Married | House / apartment | -20659 | -10455 | -4998.0 | -4010 | WEDNESDAY | 10 | Culture | | 0.62374 | 0.710674 |
| 4 | 122096 | 241559 | 0 | Revolving loans | M | 0 | 135000.0 | 180000.0 | 9000.0 | 180000.0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | -9013 | -1190 | -3524.0 | -1644 | SUNDAY | 11 | Construction | 0.175511 | 0.492994 | 0.085595 |
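
The preview contains two columns named `Unnamed: 0.1` and `Unnamed: 0`, and some `EXT_SCORE_*` cells are empty. Columns with names like these usually appear when a DataFrame is written to CSV together with its index and then read back, although the source does not confirm that this is what happened here. The sketch below shows one possible clean-up under that assumption. It reads a tiny inline stand-in for the file (a few columns copied from the preview rows) so that it runs without `./Dataset/app_train.csv`.

```python
import io

import pandas as pd

# Tiny stand-in for './Dataset/app_train.csv': a few columns from the preview rows.
csv_text = """Unnamed: 0.1,Unnamed: 0,LN_ID,TARGET,EXT_SCORE_1
0,201468,333538,0,0.572805
1,264803,406644,0,0.6556
2,137208,259130,0,
"""

df = pd.read_csv(io.StringIO(csv_text))

# Assumption: columns named 'Unnamed: ...' are leftover index columns
# with no predictive meaning, so they can be dropped.
leftover = [col for col in df.columns if col.startswith("Unnamed:")]
df = df.drop(columns=leftover)

# Count missing values per column to see which features need imputation.
missing = df.isna().sum()

print(df.columns.tolist())
print(missing)
```

With the real file, the same two steps would apply after `pd.read_csv('./Dataset/app_train.csv')`. It is worth checking first that the `Unnamed` columns are not needed as join keys elsewhere.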
