# Personal Loan Product Model (Odds Ratio)
Bank Customer accepts Personal Loans

**2017-2023 [FinanceData]()**

# Bank Customer accepts Personal Loans dataset

UniversalBank.xls

https://gist.github.com/880e4ef1f395025be56b7370bce068ff

In [None]:
!curl -L "http://bit.ly/2EtPI5T" -o "UniversalBank.xls"


In [None]:
import numpy as np
import pandas as pd

loans = pd.read_excel('UniversalBank.xls', sheet_name='Data', skiprows=3)
loans.head()

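| | ID | Age | Experience | Income | ZIP Code | Family | CCAvg | Education | Mortgage | Personal Loan | Securities Account | CD Account | Online | CreditCard |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 25 | 1 | 49 | 91107 | 4 | 1.6 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 1 | 2 | 45 | 19 | 34 | 90089 | 3 | 1.5 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 2 | 3 | 39 | 15 | 11 | 94720 | 1 | 1.0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 4 | 35 | 9 | 100 | 94112 | 1 | 2.7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 5 | 35 | 8 | 45 | 91330 | 4 | 1.0 | 2 | 0 | 0 | 0 | 0 | 0 | 1 |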


In [None]:
len(loans)

5000

## Preparing the training data (features and target)

In [None]:
x_train = loans.drop('Personal Loan', axis=1)
y_train = loans['Personal Loan']
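
The cells above keep all 5,000 rows in `x_train` and `y_train`, so no test set is actually held out. If a separate test set is wanted, one possible approach is sketched below. The `toy` frame, its values, and the 80/20 ratio are assumptions made for illustration; they are not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for `loans`; the values are made up for illustration only.
toy = pd.DataFrame({
    "Income": [49, 34, 11, 100, 45, 29, 72, 22, 81, 180],
    "Personal Loan": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
})

# Hold out 20% of the rows as a test set; the rest is used for training.
test_df = toy.sample(frac=0.2, random_state=0)
train_df = toy.drop(test_df.index)

# Separate features and target in each part.
x_tr = train_df.drop("Personal Loan", axis=1)
y_tr = train_df["Personal Loan"]
x_te = test_df.drop("Personal Loan", axis=1)
y_te = test_df["Personal Loan"]

print(len(x_tr), len(x_te))
```

With a target as imbalanced as a loan-acceptance flag, a stratified split may be preferable so that both parts contain accepted loans.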

In [None]:
x_train

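| | ID | Age | Experience | Income | ZIP Code | Family | CCAvg | Education | Mortgage | Securities Account | CD Account | Online | CreditCard |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 25 | 1 | 49 | 91107 | 4 | 1.6 | 1 | 0 | 1 | 0 | 0 | 0 |
| 1 | 2 | 45 | 19 | 34 | 90089 | 3 | 1.5 | 1 | 0 | 1 | 0 | 0 | 0 |
| 2 | 3 | 39 | 15 | 11 | 94720 | 1 | 1.0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 3 | 4 | 35 | 9 | 100 | 94112 | 1 | 2.7 | 2 | 0 | 0 | 0 | 0 | 0 |
| 4 | 5 | 35 | 8 | 45 | 91330 | 4 | 1.0 | 2 | 0 | 0 | 0 | 0 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4995 | 4996 | 29 | 3 | 40 | 92697 | 1 | 1.9 | 3 | 0 | 0 | 0 | 1 | 0 |
| 4996 | 4997 | 30 | 4 | 15 | 92037 | 4 | 0.4 | 1 | 85 | 0 | 0 | 1 | 0 |
| 4997 | 4998 | 63 | 39 | 24 | 93023 | 2 | 0.3 | 3 | 0 | 0 | 0 | 0 | 0 |
| 4998 | 4999 | 65 | 40 | 49 | 90034 | 3 | 0.5 | 2 | 0 | 0 | 0 | 1 | 0 |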


In [None]:
y_train

0       0
1       0
2       0
3       0
4       0
       ..
4995    0
4996    0
4997    0
4998    0
4999    0
Name: Personal Loan, Length: 5000, dtype: int64

## Logistic regression with statsmodels



In [None]:
import statsmodels.api as sm

logit = sm.Logit(y_train, x_train)
results = logit.fit()

results.summary()

Optimization terminated successfully.
         Current function value: 0.129635
         Iterations 9


| | | | |
|---|---|---|---|
| Dep. Variable: | Personal Loan | No. Observations: | 5000 |
| Model: | Logit | Df Residuals: | 4987 |
| Method: | MLE | Df Model: | 12 |
| Date: | Sun, 16 Jan 2022 | Pseudo R-squ.: | 0.59 |
| Time: | 15:51:13 | Log-Likelihood: | -648.18 |
| converged: | True | LL-Null: | -1581.0 |
| Covariance Type: | nonrobust | LLR p-value: | 0.0 |

| | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| ID | -5.533e-05 | 5.12e-05 | -1.081 | 0.280 | -0.000 | 4.5e-05 |
| Age | -0.1853 | 0.055 | -3.368 | 0.001 | -0.293 | -0.077 |
| Experience | 0.1919 | 0.055 | 3.480 | 0.001 | 0.084 | 0.300 |
| Income | 0.0536 | 0.003 | 20.784 | 0.000 | 0.049 | 0.059 |
| ZIP Code | -9.075e-05 | 1.52e-05 | -5.978 | 0.000 | -0.000 | -6.1e-05 |
| Family | 0.6719 | 0.073 | 9.158 | 0.000 | 0.528 | 0.816 |
| CCAvg | 0.1196 | 0.039 | 3.034 | 0.002 | 0.042 | 0.197 |
| Education | 1.7318 | 0.114 | 15.206 | 0.000 | 1.509 | 1.955 |
| Mortgage | 0.0005 | 0.001 | 0.861 | 0.389 | -0.001 | 0.002 |


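# Logistic regression model: results and interpretation
coef (logistic regression coefficient, the estimated parameter value)
* The change in log(Odds) when the variable increases by 1, with the other variables held fixed
* Positive → the odds, and therefore the probability, of success increase (odds ratio > 1); negative → they decrease (0 < odds ratio < 1)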
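
As a minimal sketch of what "change in log(Odds)" means, the snippet below adds the `Family` coefficient from the summary table to a baseline log-odds and converts both values to probabilities. The baseline of -2.0 is a hypothetical value chosen only for illustration.

```python
import math

def sigmoid(log_odds: float) -> float:
    """Convert log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))

base_log_odds = -2.0   # hypothetical baseline, not taken from the model
coef_family = 0.6719   # Family coefficient from the summary table

p_before = sigmoid(base_log_odds)
p_after = sigmoid(base_log_odds + coef_family)

odds_before = p_before / (1 - p_before)
odds_after = p_after / (1 - p_after)

print(f"probability: {p_before:.3f} -> {p_after:.3f}")
# The ratio of the two odds equals exp(coef), whatever the baseline is.
print(f"odds ratio:  {odds_after / odds_before:.3f}")
```

The probability change depends on the baseline, while the odds ratio does not, which is why coefficients are usually reported as odds ratios.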

std err (standard error of the estimated parameter)
* Used to build the confidence interval of the estimated parameter (interval estimation)
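
A 95% interval of this kind can be reproduced by hand as coef ± z × std err. The sketch below uses the rounded `Family` values from the summary table, so the result matches the reported `[0.025, 0.975]` columns only up to rounding.

```python
import math
from statistics import NormalDist

coef, std_err = 0.6719, 0.073  # Family row, rounded as printed in the summary

# Critical value of the standard normal for a two-sided 95% interval.
z_crit = NormalDist().inv_cdf(0.975)

lower = coef - z_crit * std_err
upper = coef + z_crit * std_err
print(f"95% CI for coef:       [{lower:.3f}, {upper:.3f}]")

# Exponentiating the bounds gives an interval for the odds ratio.
print(f"95% CI for odds ratio: [{math.exp(lower):.3f}, {math.exp(upper):.3f}]")
```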

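P > |z| (p-value)
* Indicates whether the variable is statistically significant
* Tests the null hypothesis (H0) that the parameter equals 0 (hypothesis test)
* When it is close to 0 (commonly below 0.05), H0 can be rejected, so the variable is judged to have an effect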
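
The p-value is derived from the `z` column (coef divided by std err) using the standard normal distribution. The sketch below recomputes it for three `z` values copied from the summary table; the helper name `two_sided_p` is hypothetical.

```python
from statistics import NormalDist

def two_sided_p(z: float) -> float:
    """Two-sided p-value of a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# z statistics as printed in the summary table.
z_values = {"ID": -1.081, "CCAvg": 3.034, "Mortgage": 0.861}

for name, z in z_values.items():
    p = two_sided_p(z)
    verdict = "reject H0" if p < 0.05 else "cannot reject H0"
    print(f"{name:10s} z={z:7.3f}  p={p:.3f}  {verdict} at the 5% level")
```

By this criterion `ID` and `Mortgage` are not significant in the fitted model, while `CCAvg` is.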

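Odds Ratio
* The factor by which the odds are multiplied when one variable increases by 1 unit, with the other variables held fixed; it equals exp(coef)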
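
Because the odds ratio is exp(coef), a change of k units multiplies the odds by exp(k × coef), not by k times the one-unit ratio. A minimal sketch using the `Income` coefficient from `results.params`; the helper `odds_ratio` and the choice of k = 10 are assumptions for illustration.

```python
import math

def odds_ratio(coef: float, units: float = 1.0) -> float:
    """Multiplicative change in the odds for a `units` increase in a variable."""
    return math.exp(coef * units)

coef_income = 0.053609  # Income coefficient from results.params

one_unit = odds_ratio(coef_income)
ten_units = odds_ratio(coef_income, units=10)

print(f"+1 unit of Income:   odds x {one_unit:.3f}")
print(f"+10 units of Income: odds x {ten_units:.3f}")
```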

In [None]:
results.params

ID                   -0.000055
Age                  -0.185266
Experience            0.191855
Income                0.053609
ZIP Code             -0.000091
Family                0.671892
CCAvg                 0.119647
Education             1.731761
Mortgage              0.000477
Securities Account   -0.978336
CD Account            3.868096
Online               -0.696858
CreditCard           -1.154681
dtype: float64

In [None]:
result = pd.DataFrame(results.params, columns=['coef'])
result

| | coef |
|---|---|
| ID | -5.5e-05 |
| Age | -0.185266 |
| Experience | 0.191855 |
| Income | 0.053609 |
| ZIP Code | -9.1e-05 |
| Family | 0.671892 |
| CCAvg | 0.119647 |
| Education | 1.731761 |
| Mortgage | 0.000477 |
| Securities Account | -0.978336 |


In [None]:
result['odds'] = np.exp(result['coef'])
result

| | coef | odds |
|---|---|---|
| ID | -5.5e-05 | 0.999945 |
| Age | -0.185266 | 0.830883 |
| Experience | 0.191855 | 1.211495 |
| Income | 0.053609 | 1.055072 |
| ZIP Code | -9.1e-05 | 0.999909 |
| Family | 0.671892 | 1.957937 |
| CCAvg | 0.119647 | 1.127099 |
| Education | 1.731761 | 5.650599 |
| Mortgage | 0.000477 | 1.000477 |
| Securities Account | -0.978336 | 0.375936 |


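* When Experience increases by 1, the odds are multiplied by about 1.21 (exp(0.191855) ≈ 1.21), with the other variables held fixed: one more year of experience raises the odds of accepting a loan by about 21%.
* When CreditCard increases by 1 (from 0 to 1), the odds are multiplied by about 0.32 (exp(-1.154681) ≈ 0.315): customers who hold a credit card have roughly 68% lower odds of accepting a loan.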
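
The odds ratios for Experience and CreditCard follow directly from the coefficients in `results.params`, and expressing them as a percent change in the odds can make the direction of the effect easier to read. A minimal sketch with the two coefficients copied from that output:

```python
import math

# Coefficients copied from results.params.
coefs = {"Experience": 0.191855, "CreditCard": -1.154681}

odds_ratios = {name: math.exp(c) for name, c in coefs.items()}

for name, ratio in odds_ratios.items():
    # (ratio - 1) * 100 is the percent change in the odds per one-unit increase.
    pct = (ratio - 1.0) * 100.0
    print(f"{name:10s} odds ratio = {ratio:.3f}  ({pct:+.1f}% change in odds)")
```

An odds ratio below 1 is a decrease in the odds, so a value near 0.32 should not be read as a 0.32-fold increase.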
