# Project Problem Statement
A company wants to automate its loan eligibility process using the details customers provide in an online application form: Gender, Marital Status, Education, Number of Dependents, Applicant and Co-applicant Income, Required Loan Amount, Required Loan Term, Credit History, and others. The requirements are as follows:

1. Check the eligibility of the customer given the inputs described above.
2. Identify customer segments from the given data and assign each customer to one of the segments.
3. If the customer is not eligible for the requested amount and duration:

   3.1 What amount can be offered for the given duration?
   3.2 If the duration is less than or equal to 20 years, is the customer eligible for the requested amount over a longer duration? If so, what is that duration?

# Read Dataset

In [2]:
import pandas as pd
from warnings import filterwarnings
filterwarnings("ignore")

In [3]:
A=pd.read_csv("C:/Users/shree/Desktop/DataScience/Day 29 (Project Discussion, Multiclass Classification)/training_set.csv")

# Preview of Data

In [4]:
A

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849.0,0.0,,360.0,1.0,Urban,Y
1,LP001003,Male,Yes,1,Graduate,No,,1508.0,128.0,360.0,1.0,Rural,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000.0,0.0,66.0,360.0,1.0,Urban,Y
3,LP001006,Male,Yes,0,Not Graduate,No,2583.0,2358.0,120.0,360.0,1.0,Urban,Y
4,LP001008,Male,No,0,Graduate,No,6000.0,0.0,141.0,360.0,1.0,Urban,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...
609,LP002978,Female,No,0,Graduate,No,2900.0,0.0,71.0,360.0,1.0,Rural,Y
610,LP002979,Male,Yes,3+,Graduate,No,4106.0,0.0,40.0,180.0,1.0,Rural,Y
611,LP002983,Male,Yes,1,Graduate,No,8072.0,240.0,253.0,360.0,1.0,Urban,Y
612,LP002984,Male,Yes,2,Graduate,No,7583.0,0.0,187.0,360.0,1.0,Urban,Y


In [5]:
A.head(10)

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849.0,0.0,,360.0,1.0,Urban,Y
1,LP001003,Male,Yes,1,Graduate,No,,1508.0,128.0,360.0,1.0,Rural,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000.0,0.0,66.0,360.0,1.0,Urban,Y
3,LP001006,Male,Yes,0,Not Graduate,No,2583.0,2358.0,120.0,360.0,1.0,Urban,Y
4,LP001008,Male,No,0,Graduate,No,6000.0,0.0,141.0,360.0,1.0,Urban,Y
5,LP001011,,Yes,2,Graduate,Yes,5417.0,4196.0,267.0,360.0,1.0,Urban,Y
6,LP001013,Male,Yes,0,Not Graduate,No,2333.0,1516.0,95.0,360.0,1.0,Urban,Y
7,LP001014,Male,Yes,3+,Graduate,No,3036.0,2504.0,158.0,360.0,0.0,Semiurban,N
8,LP001018,Male,Yes,2,,No,4006.0,1526.0,168.0,360.0,1.0,Urban,Y
9,LP001020,Male,Yes,1,Graduate,No,12841.0,10968.0,349.0,360.0,1.0,Semiurban,N


In [6]:
A.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 614 entries, 0 to 613
Data columns (total 13 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Loan_ID            614 non-null    object 
 1   Gender             599 non-null    object 
 2   Married            611 non-null    object 
 3   Dependents         599 non-null    object 
 4   Education          613 non-null    object 
 5   Self_Employed      582 non-null    object 
 6   ApplicantIncome    612 non-null    float64
 7   CoapplicantIncome  613 non-null    float64
 8   LoanAmount         592 non-null    float64
 9   Loan_Amount_Term   600 non-null    float64
 10  Credit_History     564 non-null    float64
 11  Property_Area      614 non-null    object 
 12  Loan_Status        614 non-null    object 
dtypes: float64(5), object(8)
memory usage: 62.5+ KB


In [7]:
A.shape

(614, 13)

# Missing Data Treatment

In [8]:
A.isna().sum()

Loan_ID               0
Gender               15
Married               3
Dependents           15
Education             1
Self_Employed        32
ApplicantIncome       2
CoapplicantIncome     1
LoanAmount           22
Loan_Amount_Term     14
Credit_History       50
Property_Area         0
Loan_Status           0
dtype: int64

In [9]:
from PM6 import replacer
replacer(A)

In [11]:
A.isna().sum()

Loan_ID              0
Gender               0
Married              0
Dependents           0
Education            0
Self_Employed        0
ApplicantIncome      0
CoapplicantIncome    0
LoanAmount           0
Loan_Amount_Term     0
Credit_History       0
Property_Area        0
Loan_Status          0
dtype: int64
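
The `replacer` helper comes from the author's own `PM6` module, and its source is not shown here. The filled values that appear later (for example 146.412162 for the missing `LoanAmount` in row 0, and `Male` for the missing `Gender` in row 5) are consistent with mean imputation for numeric columns and mode imputation for text columns. The sketch below is written under that assumption only; `replacer_sketch` and the toy frame are hypothetical names, not the author's code.

```python
import numpy as np
import pandas as pd


def replacer_sketch(df):
    """Fill missing values in place: mean for numeric columns, mode otherwise.

    Assumed behaviour, inferred from the notebook's outputs.
    """
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].mean())
        else:
            df[col] = df[col].fillna(df[col].mode()[0])


# Toy data, made up for illustration.
demo = pd.DataFrame({
    "Gender": ["Male", None, "Male", "Female"],
    "LoanAmount": [100.0, np.nan, 200.0, 150.0],
})
replacer_sketch(demo)
print(demo)
```

If the real helper behaves differently (for instance, median instead of mean), the imputed numbers in the rest of the notebook would differ accordingly.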

# EDA
Separate the categorical (`cat`) and continuous (`con`) columns.

In [12]:
cat=[]
con=[]
for i in A.columns:
    if A[i].dtypes=="object":
        cat.append(i)
    else:
        con.append(i)

In [13]:
cat

['Loan_ID',
 'Gender',
 'Married',
 'Dependents',
 'Education',
 'Self_Employed',
 'Property_Area',
 'Loan_Status']

In [14]:
con

['ApplicantIncome',
 'CoapplicantIncome',
 'LoanAmount',
 'Loan_Amount_Term',
 'Credit_History']
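
The same split can also be written without an explicit loop. This is a small sketch using `DataFrame.select_dtypes` on a made-up frame; the names `demo`, `con_demo` and `cat_demo` are illustrative only.

```python
import pandas as pd

# Toy frame with one text column and two numeric columns.
demo = pd.DataFrame({
    "Gender": ["Male", "Female"],
    "ApplicantIncome": [5849.0, 3000.0],
    "Credit_History": [1.0, 0.0],
})

# Numeric columns are treated as continuous, everything else as categorical.
con_demo = demo.select_dtypes(include="number").columns.tolist()
cat_demo = demo.select_dtypes(exclude="number").columns.tolist()
print(cat_demo, con_demo)
```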

# Define X & Y

In [15]:
Y=A['Loan_Status']

In [38]:
X=A.drop(labels=['Loan_Status'],axis=1)

In [39]:
X

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area
0,LP001002,Male,No,0,Graduate,No,5849.00000,0.0,146.412162,360.0,1.0,Urban
1,LP001003,Male,Yes,1,Graduate,No,5405.54085,1508.0,128.000000,360.0,1.0,Rural
2,LP001005,Male,Yes,0,Graduate,Yes,3000.00000,0.0,66.000000,360.0,1.0,Urban
3,LP001006,Male,Yes,0,Not Graduate,No,2583.00000,2358.0,120.000000,360.0,1.0,Urban
4,LP001008,Male,No,0,Graduate,No,6000.00000,0.0,141.000000,360.0,1.0,Urban
...,...,...,...,...,...,...,...,...,...,...,...,...
609,LP002978,Female,No,0,Graduate,No,2900.00000,0.0,71.000000,360.0,1.0,Rural
610,LP002979,Male,Yes,3+,Graduate,No,4106.00000,0.0,40.000000,180.0,1.0,Rural
611,LP002983,Male,Yes,1,Graduate,No,8072.00000,240.0,253.000000,360.0,1.0,Urban
612,LP002984,Male,Yes,2,Graduate,No,7583.00000,0.0,187.000000,360.0,1.0,Urban


# Skew

In [40]:
X.skew()

ApplicantIncome      6.538870
CoapplicantIncome    7.492000
LoanAmount           2.726601
Loan_Amount_Term    -2.389680
Credit_History      -1.963600
dtype: float64

In [41]:
S=X.skew()
skew=list(S[S>2].index)

In [42]:
skew

['ApplicantIncome', 'CoapplicantIncome', 'LoanAmount']

In [43]:
import numpy as np
# Log-transform each right-skewed column; zeros are kept as 0 because log(0) is undefined.
for j in skew:
    X[j]=X[j].apply(lambda v: np.log(v) if v!=0 else v)
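
To illustrate what the transform above does, here is a short sketch on made-up numbers. It applies the notebook's rule (log of non-zero values, zeros left alone) and, for comparison, `np.log1p`, which is a common alternative that handles zeros without a special case. The series values are invented for the example.

```python
import numpy as np
import pandas as pd

# Made-up incomes with one large outlier, similar in spirit to ApplicantIncome.
income = pd.Series([1000.0, 1200.0, 1500.0, 2000.0, 2500.0, 3000.0, 50000.0])
# Made-up co-applicant incomes that include zeros.
coapplicant = pd.Series([0.0, 1508.0, 0.0, 2358.0])

income_log = np.log(income)
# Notebook's rule: log for non-zero values, zeros stay zero.
coapplicant_log = coapplicant.apply(lambda v: np.log(v) if v != 0 else v)
# Alternative: log(1 + x), defined at zero.
coapplicant_log1p = np.log1p(coapplicant)

print("skew before:", income.skew(), "after:", income_log.skew())
print(coapplicant_log.tolist())
print(coapplicant_log1p.tolist())
```

Note that mapping zero to zero places those rows far below the smallest logged non-zero value, so a column with many zeros (such as `CoapplicantIncome`) can remain strongly non-normal after the transform.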

# One Hot Encoding

In [45]:
cat =['Gender', 'Married', 'Dependents', 'Education', 'Self_Employed', 'Property_Area']

In [46]:
X1=pd.get_dummies(X[cat])

In [47]:
X1

Unnamed: 0,Gender_Female,Gender_Male,Married_No,Married_Yes,Dependents_0,Dependents_1,Dependents_2,Dependents_3+,Education_Graduate,Education_Not Graduate,Self_Employed_No,Self_Employed_Yes,Property_Area_Rural,Property_Area_Semiurban,Property_Area_Urban
0,0,1,1,0,1,0,0,0,1,0,1,0,0,0,1
1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,0
2,0,1,0,1,1,0,0,0,1,0,0,1,0,0,1
3,0,1,0,1,1,0,0,0,0,1,1,0,0,0,1
4,0,1,1,0,1,0,0,0,1,0,1,0,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
609,1,0,1,0,1,0,0,0,1,0,1,0,1,0,0
610,0,1,0,1,0,0,0,1,1,0,1,0,1,0,0
611,0,1,0,1,0,1,0,0,1,0,1,0,0,0,1
612,0,1,0,1,0,0,1,0,1,0,1,0,0,0,1


In [55]:
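from PM6 import chisq,ANOVA

# Feature selection: keep only the columns whose test against Loan_Status gives p < 0.05.
imp_cat=[]
imp_con=[]

for i in X.columns:
    if X[i].dtypes=="object":
        # categorical feature vs categorical target: chi-square test
        q=chisq(A,"Loan_Status",i)
        if (q < 0.05):
            imp_cat.append(i)
    else:
        # continuous feature vs categorical target: ANOVA
        q=ANOVA(A,"Loan_Status",i)
        if (q < 0.05):
            imp_con.append(i)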

In [56]:
imp_cat

['Married', 'Education', 'Property_Area']

In [57]:
imp_con

['Credit_History']
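
`chisq` and `ANOVA` are helpers from the author's `PM6` module and are not shown. Judging by how they are called, each returns a p-value. The sketch below shows one way such helpers could be written with `scipy.stats`; the function names, the argument order, and the toy data are assumptions for illustration, and the real helpers may compute the p-value differently.

```python
import pandas as pd
from scipy.stats import chi2_contingency, f_oneway


def chisq_sketch(df, cat1, cat2):
    """p-value of a chi-square test of independence between two categorical columns."""
    table = pd.crosstab(df[cat1], df[cat2])
    _, p_value, _, _ = chi2_contingency(table)
    return p_value


def anova_sketch(df, cat, con):
    """p-value of a one-way ANOVA of a continuous column across the groups of `cat`."""
    groups = [g[con].to_numpy() for _, g in df.groupby(cat)]
    _, p_value = f_oneway(*groups)
    return p_value


# Toy data in which both features are clearly related to the target.
demo = pd.DataFrame({
    "Loan_Status": ["Y"] * 20 + ["N"] * 20,
    "Credit_History": [1.0] * 18 + [0.0] * 2 + [0.0] * 15 + [1.0] * 5,
    "Married": ["Yes"] * 16 + ["No"] * 4 + ["Yes"] * 5 + ["No"] * 15,
})

p_married = chisq_sketch(demo, "Loan_Status", "Married")
p_credit = anova_sketch(demo, "Loan_Status", "Credit_History")
print(p_married, p_credit)
```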

# Preprocessing

In [58]:
from PM6 import preprocessing

imp_col=imp_con+imp_cat
Xnew=preprocessing(X[imp_col])

# Split Data Into Training & Testing set

In [59]:
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest=train_test_split(Xnew, Y, test_size=0.2, random_state=21)

In [60]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()

model=lr.fit(xtrain,ytrain)
pred=model.predict(xtest)

In [61]:
from sklearn.metrics import confusion_matrix
confusion_matrix(ytest,pred)

array([[19, 19],
       [ 4, 81]], dtype=int64)

In [62]:
from sklearn.metrics import accuracy_score
accuracy_score(ytest,pred)

0.8130081300813008

In [63]:
from sklearn.metrics import recall_score

# Encode the labels (N -> 0, Y -> 1) so that recall_score's default pos_label=1 refers to "Y".
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
ytest1 = le.fit_transform(ytest)
pred1 = le.transform(pred)

In [64]:
recall_score(ytest1,pred1)

0.9529411764705882
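
The accuracy and recall above can be checked by hand from the confusion matrix. In scikit-learn's layout, rows are the true classes and columns the predicted classes, with labels in sorted order (`N`, then `Y`). The sketch below redoes the arithmetic; the per-class recall for `N` and the precision for `Y` are extra quantities derived from the same matrix, not values reported in the notebook.

```python
import numpy as np

# Confusion matrix copied from the notebook output (rows: true N, Y; columns: predicted N, Y).
cm = np.array([[19, 19],
               [4, 81]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # (81 + 19) / 123
recall_y = tp / (tp + fn)         # 81 / 85, matches recall_score above
recall_n = tn / (tn + fp)         # 19 / 38
precision_y = tp / (tp + fp)      # 81 / 100

print(accuracy, recall_y, recall_n, precision_y)
```

The high recall for `Y` sits alongside a recall of only 0.5 for `N`: half of the truly rejected applications in the hold-out set are predicted as approved, which matters if wrongly approving a loan is the costlier error.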

# Operations On Testing Data

In [65]:
B=pd.read_csv("C:/Users/shree/Desktop/DataScience/Day 29 (Project Discussion, Multiclass Classification)/testing_set.csv")

In [66]:
B.shape

(367, 12)

In [67]:
B.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 367 entries, 0 to 366
Data columns (total 12 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   Loan_ID            367 non-null    object 
 1   Gender             356 non-null    object 
 2   Married            367 non-null    object 
 3   Dependents         357 non-null    object 
 4   Education          367 non-null    object 
 5   Self_Employed      344 non-null    object 
 6   ApplicantIncome    367 non-null    int64  
 7   CoapplicantIncome  367 non-null    int64  
 8   LoanAmount         362 non-null    float64
 9   Loan_Amount_Term   361 non-null    float64
 10  Credit_History     338 non-null    float64
 11  Property_Area      367 non-null    object 
dtypes: float64(3), int64(2), object(7)
memory usage: 34.5+ KB


# Missing Data Treatment

In [68]:
B.isna().sum()

Loan_ID               0
Gender               11
Married               0
Dependents           10
Education             0
Self_Employed        23
ApplicantIncome       0
CoapplicantIncome     0
LoanAmount            5
Loan_Amount_Term      6
Credit_History       29
Property_Area         0
dtype: int64

In [69]:
replacer(B)

In [70]:
B.isna().sum()

Loan_ID              0
Gender               0
Married              0
Dependents           0
Education            0
Self_Employed        0
ApplicantIncome      0
CoapplicantIncome    0
LoanAmount           0
Loan_Amount_Term     0
Credit_History       0
Property_Area        0
dtype: int64

In [71]:
imp_col

['Credit_History', 'Married', 'Education', 'Property_Area']

In [72]:
Bnew=B[imp_col]

In [73]:
Bfinal=preprocessing(Bnew)
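
The `preprocessing` helper is also from `PM6` and its code is not shown. If it fits its scaling or encoding afresh on whatever frame it receives, then the training and test frames could end up scaled differently or with different columns. One common way to avoid that, sketched here with scikit-learn on made-up data, is to fit the transformers once on the training data and reuse them on the test data. The column choices and the `pre` object are illustrative assumptions, not the author's pipeline.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy training and test frames; the test frame has fewer categories than training.
train = pd.DataFrame({
    "Credit_History": [1.0, 0.0, 1.0, 1.0],
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban"],
})
test = pd.DataFrame({
    "Credit_History": [1.0, 0.0],
    "Property_Area": ["Urban", "Urban"],
})

pre = ColumnTransformer(
    [
        ("num", StandardScaler(), ["Credit_History"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["Property_Area"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

train_ready = pre.fit_transform(train)  # learn means, scales and categories here
test_ready = pre.transform(test)        # reuse them, so the columns line up
print(train_ready.shape, test_ready.shape)
```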

In [74]:
# Refit on the full training data (no hold-out) before predicting on the unseen applications.
from sklearn.linear_model import LogisticRegression
lr=LogisticRegression()
model=lr.fit(Xnew,Y)

predicted=model.predict(Bfinal)

In [75]:
B["Pred_LoanStatus"]=predicted

In [76]:
B

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Pred_LoanStatus
0,LP001015,Male,Yes,0,Graduate,No,5720,0,110.0,360.0,1.000000,Urban,Y
1,LP001022,Male,Yes,1,Graduate,No,3076,1500,126.0,360.0,1.000000,Urban,Y
2,LP001031,Male,Yes,2,Graduate,No,5000,1800,208.0,360.0,1.000000,Urban,Y
3,LP001035,Male,Yes,2,Graduate,No,2340,2546,100.0,360.0,0.825444,Urban,Y
4,LP001051,Male,No,0,Not Graduate,No,3276,0,78.0,360.0,1.000000,Urban,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...
362,LP002971,Male,Yes,3+,Not Graduate,Yes,4009,1777,113.0,360.0,1.000000,Urban,Y
363,LP002975,Male,Yes,0,Graduate,No,4158,709,115.0,360.0,1.000000,Urban,Y
364,LP002980,Male,No,0,Graduate,No,3250,1993,126.0,360.0,0.825444,Semiurban,Y
365,LP002986,Male,Yes,0,Graduate,No,5000,2393,158.0,360.0,1.000000,Rural,Y


# If the customer is not eligible for the requested amount and duration

# What amount can be offered for the given duration?


In [77]:
A=pd.read_csv("C:/Users/shree/Desktop/DataScience/Day 29 (Project Discussion, Multiclass Classification)/training_set.csv")

In [78]:
A

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849.0,0.0,,360.0,1.0,Urban,Y
1,LP001003,Male,Yes,1,Graduate,No,,1508.0,128.0,360.0,1.0,Rural,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000.0,0.0,66.0,360.0,1.0,Urban,Y
3,LP001006,Male,Yes,0,Not Graduate,No,2583.0,2358.0,120.0,360.0,1.0,Urban,Y
4,LP001008,Male,No,0,Graduate,No,6000.0,0.0,141.0,360.0,1.0,Urban,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...
609,LP002978,Female,No,0,Graduate,No,2900.0,0.0,71.0,360.0,1.0,Rural,Y
610,LP002979,Male,Yes,3+,Graduate,No,4106.0,0.0,40.0,180.0,1.0,Rural,Y
611,LP002983,Male,Yes,1,Graduate,No,8072.0,240.0,253.0,360.0,1.0,Urban,Y
612,LP002984,Male,Yes,2,Graduate,No,7583.0,0.0,187.0,360.0,1.0,Urban,Y


# Missing Data Treatment

In [79]:
replacer(A)

In [80]:
A.isna().sum()

Loan_ID              0
Gender               0
Married              0
Dependents           0
Education            0
Self_Employed        0
ApplicantIncome      0
CoapplicantIncome    0
LoanAmount           0
Loan_Amount_Term     0
Credit_History       0
Property_Area        0
Loan_Status          0
dtype: int64

In [81]:
# Keep only the approved loans, so the regression learns amounts that were actually granted.
A1=A[A["Loan_Status"]=="Y"]

In [82]:
A1

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849.0,0.0,146.412162,360.0,1.0,Urban,Y
2,LP001005,Male,Yes,0,Graduate,Yes,3000.0,0.0,66.000000,360.0,1.0,Urban,Y
3,LP001006,Male,Yes,0,Not Graduate,No,2583.0,2358.0,120.000000,360.0,1.0,Urban,Y
4,LP001008,Male,No,0,Graduate,No,6000.0,0.0,141.000000,360.0,1.0,Urban,Y
5,LP001011,Male,Yes,2,Graduate,Yes,5417.0,4196.0,267.000000,360.0,1.0,Urban,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...
608,LP002974,Male,Yes,0,Graduate,No,3232.0,1950.0,108.000000,360.0,1.0,Rural,Y
609,LP002978,Female,No,0,Graduate,No,2900.0,0.0,71.000000,360.0,1.0,Rural,Y
610,LP002979,Male,Yes,3+,Graduate,No,4106.0,0.0,40.000000,180.0,1.0,Rural,Y
611,LP002983,Male,Yes,1,Graduate,No,8072.0,240.0,253.000000,360.0,1.0,Urban,Y


In [83]:
A.skew()

ApplicantIncome      6.538870
CoapplicantIncome    7.492000
LoanAmount           2.726601
Loan_Amount_Term    -2.389680
Credit_History      -1.963600
dtype: float64

In [84]:
A.corr()[["LoanAmount"]].sort_values(by="LoanAmount")

Unnamed: 0,LoanAmount
Credit_History,-0.007738
Loan_Amount_Term,0.038801
CoapplicantIncome,0.187884
ApplicantIncome,0.565552
LoanAmount,1.0


# ANOVA

In [86]:
from PM6 import ANOVA

# Keep the categorical columns whose ANOVA p-value against LoanAmount is below 0.04.
imp_cat1=[]

for i in A.columns:
    if A[i].dtypes=="object":
        q=ANOVA(A,i,"LoanAmount")
        if (q < 0.04):
            imp_cat1.append(i)

In [87]:
imp_con1=["ApplicantIncome"]

In [88]:
imp_cat1

['Gender', 'Married', 'Dependents', 'Education', 'Self_Employed']

# Define X & Y

In [90]:
Y1=A1[["LoanAmount"]]

In [91]:
w=pd.get_dummies(A1[imp_cat1])

In [92]:
Xnew=A1[imp_con1].join(w)

# Splitting the Data Set

In [94]:
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest=train_test_split(Xnew,Y1,test_size=0.2,random_state=21)

from statsmodels.api import OLS
# No constant is added: the dummy columns of each category sum to 1, so an intercept is already in the column space.
ols=OLS(ytrain,xtrain)
model=ols.fit()
model.summary()

Dep. Variable:,LoanAmount,R-squared:,0.395
Model:,OLS,Adj. R-squared:,0.381
Method:,Least Squares,F-statistic:,26.81
Date:,"Wed, 25 May 2022",Prob (F-statistic):,7.00e-32
Time:,16:17:00,Log-Likelihood:,-1884.7
No. Observations:,337,AIC:,3787.0
Df Residuals:,328,BIC:,3822.0
Df Model:,8,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
ApplicantIncome,0.0084,0.001,12.693,0.000,0.007,0.010
Gender_Female,17.0663,5.839,2.923,0.004,5.580,28.552
Gender_Male,26.5504,4.641,5.721,0.000,17.421,35.679
Married_No,13.7230,4.968,2.762,0.006,3.950,23.496
Married_Yes,29.8936,4.641,6.442,0.000,20.764,39.023
Dependents_0,2.3518,5.752,0.409,0.683,-8.964,13.668
Dependents_1,17.5442,7.965,2.203,0.028,1.875,33.213
Dependents_2,9.9989,7.691,1.300,0.194,-5.131,25.128
Dependents_3+,13.7219,10.539,1.302,0.194,-7.010,34.454

Omnibus:,118.69,Durbin-Watson:,1.899
Prob(Omnibus):,0.0,Jarque-Bera (JB):,1105.925
Skew:,1.177,Prob(JB):,7.1e-241
Kurtosis:,11.557,Cond. No.,9.79e+19


In [95]:
P_VAL=list(model.pvalues.sort_values().index)

In [96]:
P_VAL

['ApplicantIncome',
 'Education_Graduate',
 'Married_Yes',
 'Gender_Male',
 'Self_Employed_Yes',
 'Self_Employed_No',
 'Gender_Female',
 'Married_No',
 'Education_Not Graduate',
 'Dependents_1',
 'Dependents_3+',
 'Dependents_2',
 'Dependents_0']

In [97]:
B1=B[B["Pred_LoanStatus"]=="N"].copy()  # copy, so a column can be added later without a SettingWithCopyWarning

In [98]:
B1

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Pred_LoanStatus
7,LP001056,Male,Yes,2,Not Graduate,No,3881,0,147.0,360.0,0.0,Rural,N
13,LP001094,Male,Yes,2,Graduate,No,12173,0,166.0,360.0,0.0,Semiurban,N
25,LP001153,Male,No,0,Graduate,No,0,24000,148.0,360.0,0.0,Rural,N
35,LP001203,Male,No,0,Graduate,No,3150,0,176.0,360.0,0.0,Semiurban,N
55,LP001313,Male,No,0,Graduate,No,2750,0,130.0,360.0,0.0,Urban,N
58,LP001323,Female,Yes,2,Graduate,No,2779,3664,176.0,360.0,0.0,Semiurban,N
63,LP001347,Female,No,0,Graduate,No,2101,1500,108.0,360.0,0.0,Rural,N
66,LP001352,Male,Yes,0,Not Graduate,No,4700,0,135.0,360.0,0.0,Semiurban,N
67,LP001358,Male,Yes,0,Graduate,No,3445,0,130.0,360.0,0.0,Semiurban,N
69,LP001361,Male,Yes,0,Graduate,No,2458,5105,188.0,360.0,0.0,Rural,N


In [99]:
L = pd.get_dummies(B1[imp_cat1])
Loan_amount=B1[imp_con1].join(L)
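
Calling `pd.get_dummies` separately on the rejected applications only creates columns for the categories that happen to occur in that subset. If a category seen in training were absent (or a new one appeared), the column set would no longer match the one the regression was fitted on. A defensive step, sketched below on made-up data, is to reindex the new dummies to the training columns. The names `train_cols`, `rejected` and `aligned` are hypothetical.

```python
import pandas as pd

# Dummy columns as they might look for the training data (toy example).
train_cols = pd.get_dummies(
    pd.DataFrame({"Gender": ["Male", "Female"], "Married": ["Yes", "No"]})
).columns

# A subset in which no applicant is female, so Gender_Female is never created.
rejected = pd.DataFrame({"Gender": ["Male", "Male"], "Married": ["Yes", "No"]})
rejected_dummies = pd.get_dummies(rejected)

# Add the missing columns (filled with 0) and enforce the training column order.
aligned = rejected_dummies.reindex(columns=train_cols, fill_value=0)
print(aligned)
```

In the notebook's data every category appears to occur among the rejected rows, so the prediction runs as written; the reindex step only guards against subsets where that is not the case.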

In [100]:
from sklearn.linear_model import LinearRegression
lm=LinearRegression()
model = lm.fit(Xnew,Y1)
predicted=model.predict(Loan_amount)


In [102]:
B1["Loan_Amount"]=predicted

In [103]:
B1

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Pred_LoanStatus,Loan_Amount
7,LP001056,Male,Yes,2,Not Graduate,No,3881,0,147.0,360.0,0.0,Rural,N,126.327984
13,LP001094,Male,Yes,2,Graduate,No,12173,0,166.0,360.0,0.0,Semiurban,N,215.055403
25,LP001153,Male,No,0,Graduate,No,0,24000,148.0,360.0,0.0,Rural,N,87.822026
35,LP001203,Male,No,0,Graduate,No,3150,0,176.0,360.0,0.0,Semiurban,N,114.549885
55,LP001313,Male,No,0,Graduate,No,2750,0,130.0,360.0,0.0,Urban,N,111.155871
58,LP001323,Female,Yes,2,Graduate,No,2779,3664,176.0,360.0,0.0,Semiurban,N,126.866592
63,LP001347,Female,No,0,Graduate,No,2101,1500,108.0,360.0,0.0,Rural,N,97.168687
66,LP001352,Male,Yes,0,Not Graduate,No,4700,0,135.0,360.0,0.0,Semiurban,N,123.80398
67,LP001358,Male,Yes,0,Graduate,No,3445,0,130.0,360.0,0.0,Semiurban,N,131.524773
69,LP001361,Male,Yes,0,Graduate,No,2458,5105,188.0,360.0,0.0,Rural,N,123.150044


In [104]:
T=pd.merge(
    left=B[["Loan_ID","Gender","Dependents","LoanAmount","Loan_Amount_Term","Pred_LoanStatus"]],
    right=B1[["Loan_ID","Loan_Amount"]],
    how="outer",
    on="Loan_ID",
)
pd.set_option("display.max_rows",1000)
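
The merge explains why `Loan_Amount` is empty for approved applications in the table that follows: only the rejected rows have a suggested amount, so every other `Loan_ID` gets `NaN`. A toy sketch of the same pattern, with invented IDs and values:

```python
import pandas as pd

# All applications with their predicted status (made-up IDs).
all_apps = pd.DataFrame({
    "Loan_ID": ["LP1", "LP2", "LP3"],
    "Pred_LoanStatus": ["Y", "N", "Y"],
})
# Suggested amounts exist only for the rejected application.
rejected = pd.DataFrame({"Loan_ID": ["LP2"], "Loan_Amount": [126.3]})

# Every Loan_ID in `rejected` also exists in `all_apps`, so a left merge
# keeps all applications and fills NaN where no amount was suggested.
merged = all_apps.merge(rejected, on="Loan_ID", how="left")
print(merged)
```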

In [105]:
T

Unnamed: 0,Loan_ID,Gender,Dependents,LoanAmount,Loan_Amount_Term,Pred_LoanStatus,Loan_Amount
0,LP001015,Male,0,110.0,360.0,Y,
1,LP001022,Male,1,126.0,360.0,Y,
2,LP001031,Male,2,208.0,360.0,Y,
3,LP001035,Male,2,100.0,360.0,Y,
4,LP001051,Male,0,78.0,360.0,Y,
5,LP001054,Male,0,152.0,360.0,Y,
6,LP001055,Female,1,59.0,360.0,Y,
7,LP001056,Male,2,147.0,360.0,N,126.327984
8,LP001059,Male,2,280.0,240.0,Y,
9,LP001067,Male,0,123.0,360.0,Y,


In [106]:
T.to_excel("LoanStatus1.xlsx")
T.to_csv("LoanStatus1.csv")