# Binary Logistic Regression

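Develop a model to predict the non-payment of overdrafts by
customers of a multinational banking institution. The data are provided in the
file Logistic_Reg.csv. The predictors are three scores (Ind_Exp_Act_Score,
Tran_Speed_Score, and Peer_Comb_Score) and the response is the binary column
Outcome, as shown in the data preview in Step-2.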

### Step-1: Import packages

In [1]:
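from google.colab import files                        # Colab file-upload helper
import io                                             # wraps the uploaded bytes for pandas
import pandas as pd                                   # data loading and tables
import matplotlib.pyplot as plt
from scipy import stats
from sklearn.linear_model import LogisticRegression
import statsmodels.api as mysm                        # Logit model used in Step-3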

### Step-2: Read data

In [2]:
uploaded = files.upload()
data = pd.read_csv(io.BytesIO(uploaded['Logistic_Reg.csv']))
data

Saving Logistic_Reg.csv to Logistic_Reg.csv


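| | Ind_Exp_Act_Score | Tran_Speed_Score | Peer_Comb_Score | Outcome |
|---|---|---|---|---|
| 0 | 6.2 | 9.3 | 7.4 | 1 |
| 1 | 2.6 | 2.2 | 8.7 | 1 |
| 2 | 9.5 | 1.5 | 8.2 | 1 |
| 3 | 2.6 | 5.0 | 0.4 | 0 |
| 4 | 10.0 | 7.7 | 7.2 | 1 |
| ... | ... | ... | ... | ... |
| 975 | 6.7 | 2.7 | 1.8 | 0 |
| 976 | 8.3 | 9.7 | 5.6 | 1 |
| 977 | 2.3 | 0.7 | 5.5 | 0 |
| 978 | 0.9 | 4.8 | 3.0 | 0 |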


In [3]:
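# Copy the predictor columns so a column can be added later without a SettingWithCopyWarning
x = data[["Ind_Exp_Act_Score","Tran_Speed_Score","Peer_Comb_Score"]].copy()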
x

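| | Ind_Exp_Act_Score | Tran_Speed_Score | Peer_Comb_Score |
|---|---|---|---|
| 0 | 6.2 | 9.3 | 7.4 |
| 1 | 2.6 | 2.2 | 8.7 |
| 2 | 9.5 | 1.5 | 8.2 |
| 3 | 2.6 | 5.0 | 0.4 |
| 4 | 10.0 | 7.7 | 7.2 |
| ... | ... | ... | ... |
| 975 | 6.7 | 2.7 | 1.8 |
| 976 | 8.3 | 9.7 | 5.6 |
| 977 | 2.3 | 0.7 | 5.5 |
| 978 | 0.9 | 4.8 | 3.0 |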


In [4]:
y = data.Outcome
y

0      1
1      1
2      1
3      0
4      1
      ..
975    0
976    1
977    0
978    0
979    0
Name: Outcome, Length: 980, dtype: int64

In [5]:
# statsmodels Logit does not add an intercept by itself, so add a constant column of ones
x["Intecept"] = 1
x

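| | Ind_Exp_Act_Score | Tran_Speed_Score | Peer_Comb_Score | Intecept |
|---|---|---|---|---|
| 0 | 6.2 | 9.3 | 7.4 | 1 |
| 1 | 2.6 | 2.2 | 8.7 | 1 |
| 2 | 9.5 | 1.5 | 8.2 | 1 |
| 3 | 2.6 | 5.0 | 0.4 | 1 |
| 4 | 10.0 | 7.7 | 7.2 | 1 |
| ... | ... | ... | ... | ... |
| 975 | 6.7 | 2.7 | 1.8 | 1 |
| 976 | 8.3 | 9.7 | 5.6 | 1 |
| 977 | 2.3 | 0.7 | 5.5 | 1 |
| 978 | 0.9 | 4.8 | 3.0 | 1 |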


### Step-3: Developing the model
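
For reference, the binary logistic model fitted in this step can be written as below. The symbols are notation assumed here for illustration (they do not appear in the notebook): $x_1$, $x_2$, $x_3$ stand for Ind_Exp_Act_Score, Tran_Speed_Score, and Peer_Comb_Score, $\beta_0$ is the coefficient of the constant column, and $\beta_1, \beta_2, \beta_3$ are the coefficients of the three scores.

```latex
\[
P(\text{Outcome} = 1 \mid x_1, x_2, x_3) = \frac{1}{1 + e^{-z}},
\qquad
z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3
\]
```

Equivalently, $z$ is the log-odds $\ln\bigl(p / (1 - p)\bigr)$ of Outcome being 1, which is why the fitted coefficients are read on the log-odds scale.
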

In [6]:
# Fit the logistic regression by maximum likelihood (y = response, x = predictors plus constant column)
model = mysm.Logit(y, x)
result = model.fit()
result.summary()

Optimization terminated successfully.
         Current function value: 0.064710
         Iterations 12


| | | | |
|---|---|---|---|
| Dep. Variable: | Outcome | No. Observations: | 980 |
| Model: | Logit | Df Residuals: | 976 |
| Method: | MLE | Df Model: | 3 |
| Date: | Thu, 16 Mar 2023 | Pseudo R-squ.: | 0.8903 |
| Time: | 07:59:45 | Log-Likelihood: | -63.416 |
| converged: | True | LL-Null: | -577.85 |
| Covariance Type: | nonrobust | LLR p-value: | 9.799e-223 |
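
The goodness-of-fit figures in the fit output are linked by simple arithmetic. The sketch below is a hand check, not part of the original notebook: it copies the reported values and recomputes McFadden's pseudo R-squared, the likelihood-ratio statistic, and the log-likelihood implied by the optimizer's "Current function value". The variable names are illustrative.

```python
# Values copied from the fit output (rounded as reported there).
n_obs = 980            # No. Observations
ll_model = -63.416     # Log-Likelihood of the fitted model
ll_null = -577.85      # LL-Null, the intercept-only model
mean_nll = 0.064710    # "Current function value" printed by the optimizer

# McFadden's pseudo R-squared: 1 - LL_model / LL_null
pseudo_r2 = 1 - ll_model / ll_null

# Likelihood-ratio statistic comparing the fitted and intercept-only models
llr_stat = 2 * (ll_model - ll_null)

# The optimizer reports the average negative log-likelihood per observation
ll_from_optimizer = -n_obs * mean_nll

print(f"pseudo R-squared: {pseudo_r2:.4f}")
print(f"LLR statistic:    {llr_stat:.3f}")
print(f"log-likelihood:   {ll_from_optimizer:.3f}")
```

The recomputed pseudo R-squared and log-likelihood agree with the reported 0.8903 and -63.416 to the precision shown.
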

| | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Ind_Exp_Act_Score | 2.7957 | 0.355 | 7.867 | 0.000 | 2.099 | 3.492 |
| Tran_Speed_Score | 2.7532 | 0.343 | 8.032 | 0.000 | 2.081 | 3.425 |
| Peer_Comb_Score | 3.5153 | 0.434 | 8.095 | 0.000 | 2.664 | 4.366 |
| Intecept | -35.5062 | 4.406 | -8.058 | 0.000 | -44.142 | -26.870 |
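
The coefficients are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to interpret. The following is a minimal sketch using only the rounded coefficients and standard errors from the coefficient table; the 1.96 multiplier is the usual normal approximation for a 95% interval, and the names `coefs`, `Z_95`, and `odds_ratios` are illustrative.

```python
import math

# (coefficient, standard error) pairs copied from the coefficient table.
coefs = {
    "Ind_Exp_Act_Score": (2.7957, 0.355),
    "Tran_Speed_Score": (2.7532, 0.343),
    "Peer_Comb_Score": (3.5153, 0.434),
}

Z_95 = 1.96  # approximate standard-normal quantile for a 95% interval (assumption)

odds_ratios = {}
for name, (coef, se) in coefs.items():
    odds_ratio = math.exp(coef)          # multiplicative change in odds per one-unit increase
    lower = math.exp(coef - Z_95 * se)   # interval computed on the log-odds scale, then exponentiated
    upper = math.exp(coef + Z_95 * se)
    odds_ratios[name] = (odds_ratio, lower, upper)
    print(f"{name}: OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Each odds ratio is the factor by which the odds of Outcome = 1 change for a one-unit increase in that score, holding the other two scores fixed.
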


### Step-4: Printing the predicted values

In [7]:
pred = result.predict(x)
pred

0      1.000000
1      0.999776
2      1.000000
3      0.000002
4      1.000000
         ...   
975    0.046811
976    1.000000
977    0.000404
978    0.000098
979    0.075612
Length: 980, dtype: float64
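
To see where these probabilities come from, the prediction can be reproduced by hand from the fitted coefficients. The sketch below is an illustration, not the notebook's method: it applies the logistic function to the linear predictor using the rounded coefficients from the Step-3 summary, for four rows whose scores are visible in the data preview. The function name `predict_proba_by_hand` is hypothetical.

```python
import math

# Rounded coefficients from the Step-3 summary table.
INTERCEPT = -35.5062
B_IND_EXP = 2.7957
B_TRAN_SPEED = 2.7532
B_PEER_COMB = 3.5153

def predict_proba_by_hand(ind_exp, tran_speed, peer_comb):
    """Return P(Outcome = 1) from the logistic function of the linear predictor."""
    z = (INTERCEPT + B_IND_EXP * ind_exp
         + B_TRAN_SPEED * tran_speed + B_PEER_COMB * peer_comb)
    return 1.0 / (1.0 + math.exp(-z))

# Row index -> (Ind_Exp_Act_Score, Tran_Speed_Score, Peer_Comb_Score) from the data preview.
rows = {
    0: (6.2, 9.3, 7.4),
    1: (2.6, 2.2, 8.7),
    3: (2.6, 5.0, 0.4),
    975: (6.7, 2.7, 1.8),
}

by_hand = {idx: predict_proba_by_hand(*scores) for idx, scores in rows.items()}
for idx, p in by_hand.items():
    print(f"row {idx}: {p:.6f}")
```

Because the coefficients are rounded to four decimals, the results match the output of `result.predict(x)` only approximately, with small differences in the last printed digits.
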

In [8]:
output = pd.DataFrame(pred)
output

| | 0 |
|---|---|
| 0 | 1.000000 |
| 1 | 0.999776 |
| 2 | 1.000000 |
| 3 | 0.000002 |
| 4 | 1.000000 |
| ... | ... |
| 975 | 0.046811 |
| 976 | 1.000000 |
| 977 | 0.000404 |
| 978 | 0.000098 |
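
Predicted probabilities are usually turned into class labels by applying a cutoff. The sketch below assumes a cutoff of 0.5, which the notebook does not specify, and uses only the ten rows (0 to 4 and 975 to 979) whose predicted probabilities and observed outcomes are visible in the outputs above. It says nothing about the remaining 970 rows.

```python
from collections import Counter

THRESHOLD = 0.5  # assumed cutoff; not specified in the notebook

# Predicted probabilities and observed outcomes for rows 0-4 and 975-979.
probs = [1.000000, 0.999776, 1.000000, 0.000002, 1.000000,
         0.046811, 1.000000, 0.000404, 0.000098, 0.075612]
actual = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

# Classify as 1 when the predicted probability reaches the cutoff.
predicted = [1 if p >= THRESHOLD else 0 for p in probs]

# Confusion counts keyed by (actual, predicted).
confusion = Counter(zip(actual, predicted))
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

print("true negatives: ", confusion[(0, 0)])
print("false positives:", confusion[(0, 1)])
print("false negatives:", confusion[(1, 0)])
print("true positives: ", confusion[(1, 1)])
print("accuracy on these rows:", accuracy)
```

On the full data, the same comparison could be made between `pred` and `y`. The choice of cutoff depends on the relative cost of missing a non-payment versus flagging a customer who pays.
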
