# 3.3 Multiple Linear Regression

# 3.3.1 Mathematical Principles and Code Implementation of Multiple Linear Regression

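1. Case background

This case uses credit card customers to explain what customer value prediction means: it estimates how much profit a customer will bring over a coming period. That profit may come from credit card annual fees, cash-advance fees, installment fees, foreign transaction fees, and so on. Once customer value has been estimated, high-value customers can be served differently from ordinary customers in marketing, call handling, collections, product inquiries, and other services, which helps to further develop the value of these customers and increase their loyalty.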

2. Reading the data

In [1]:
import pandas as pd
df = pd.read_excel('客户价值数据表.xlsx')
df.head()  # show the first 5 rows

Unnamed: 0,客户价值,历史贷款金额,贷款次数,学历,月收入,性别
0,1150,6488,2,2,9567,1
1,1157,5194,4,2,10767,0
2,1163,7066,3,2,9317,0
3,983,3550,3,2,10517,0
4,1205,7847,3,3,11267,1


In [2]:
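# Features: historical loan amount, number of loans, education level, monthly income, gender
X = df[['历史贷款金额', '贷款次数', '学历', '月收入', '性别']]
# Target: customer value
Y = df['客户价值']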

3. Building the model

In [3]:
from sklearn.linear_model import LinearRegression
regr = LinearRegression()
regr.fit(X,Y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
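
LinearRegression fits an ordinary least squares model: it picks the intercept and coefficients that minimise the sum of squared differences between the predicted and the actual target values. As a minimal sketch of that idea, the block below solves the same kind of least squares problem directly with NumPy. The data are synthetic and every name in it (X_demo, y_demo, true_coef and so on) is an assumption made for illustration, not part of the customer value dataset.

```python
import numpy as np

# Synthetic, noise-free data: 128 samples and 5 features (illustrative only)
rng = np.random.default_rng(0)
n_samples = 128
X_demo = rng.normal(size=(n_samples, 5))
true_coef = np.array([0.5, -1.0, 2.0, 0.0, 3.0])
true_intercept = 4.0
y_demo = true_intercept + X_demo @ true_coef

# Prepend a column of ones so the first fitted parameter acts as the intercept
design = np.column_stack([np.ones(n_samples), X_demo])

# Least squares solution of design @ params ≈ y_demo
params, *_ = np.linalg.lstsq(design, y_demo, rcond=None)
intercept_hat, coef_hat = params[0], params[1:]

print(intercept_hat)
print(coef_hat)
```

Because the synthetic target contains no noise, the recovered intercept and coefficients should match the values used to generate it up to floating-point error. On real data such as the customer value table, the fit only approximates the target.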

4. Constructing the linear regression equation

In [4]:
regr.coef_

array([5.71421731e-02, 9.61723492e+01, 1.13452022e+02, 5.61326459e-02,
       1.97874093e+00])

In [5]:
print('Coefficients: ' + str(regr.coef_))
print('Intercept k0: ' + str(regr.intercept_))

Coefficients: [5.71421731e-02 9.61723492e+01 1.13452022e+02 5.61326459e-02
 1.97874093e+00]
Intercept k0: -208.42004079958383


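Here regr.coef_ returns an array of coefficients, one per feature, in the order the columns were passed in X: 历史贷款金额 (historical loan amount), 贷款次数 (number of loans), 学历 (education level), 月收入 (monthly income), and 性别 (gender). These are k1, k2, k3, k4, and k5. Together with the intercept k0 from regr.intercept_, the fitted multiple linear regression equation, with rounded coefficients, is:

y = -208 + 0.057*x1 + 96*x2 + 113*x3 + 0.056*x4 + 1.98*x5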
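
As a quick check of how the equation is used, the sketch below plugs the first row shown by df.head() into it, using the full-precision intercept and coefficients printed above rather than the rounded ones. The variable names (k0, k, x_first and so on) are chosen here for illustration only.

```python
import numpy as np

# Values copied from the regr.intercept_ and regr.coef_ output above
k0 = -208.42004079958383
k = np.array([5.71421731e-02, 9.61723492e+01, 1.13452022e+02,
              5.61326459e-02, 1.97874093e+00])

# First row of df.head(): 历史贷款金额, 贷款次数, 学历, 月收入, 性别
x_first = np.array([6488, 2, 2, 9567, 1])
actual_value = 1150  # 客户价值 recorded in that row

# y = k0 + k1*x1 + ... + k5*x5
predicted_value = k0 + k @ x_first
residual = actual_value - predicted_value

print(round(predicted_value, 1), round(residual, 1))
```

For this customer the equation gives a predicted value of about 1120.6 against a recorded value of 1150, a residual of roughly 29.4. Calling regr.predict on the same row should return the same prediction.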

5. Evaluating the model

In [6]:
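import statsmodels.api as sm  # library used here to evaluate the linear regression model
X2 = sm.add_constant(X)  # add a column of ones so the model includes an intercept
est = sm.OLS(Y, X2).fit()  # ordinary least squares fit
est.summary()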


Dep. Variable:,客户价值,R-squared:,0.571
Model:,OLS,Adj. R-squared:,0.553
Method:,Least Squares,F-statistic:,32.44
Date:,"Wed, 01 Jan 2020",Prob (F-statistic):,6.41e-21
Time:,19:43:51,Log-Likelihood:,-843.5
No. Observations:,128,AIC:,1699.0
Df Residuals:,122,BIC:,1716.0
Df Model:,5,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
const,-208.4200,163.810,-1.272,0.206,-532.699,115.859
历史贷款金额,0.0571,0.010,5.945,0.000,0.038,0.076
贷款次数,96.1723,25.962,3.704,0.000,44.778,147.567
学历,113.4520,37.909,2.993,0.003,38.406,188.498
月收入,0.0561,0.019,2.941,0.004,0.018,0.094
性别,1.9787,32.286,0.061,0.951,-61.934,65.891

Omnibus:,1.597,Durbin-Watson:,2.155
Prob(Omnibus):,0.45,Jarque-Bera (JB):,1.538
Skew:,0.264,Prob(JB):,0.464
Kurtosis:,2.9,Cond. No.,128000.0
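
To make the summary easier to read, the sketch below recomputes two of its quantities from other figures in the same output: the adjusted R-squared from R-squared, the number of observations and the number of features, and the t statistic of a coefficient as coef divided by std err. All numbers are copied from the summary above. The variable names are illustrative.

```python
# Figures copied from the est.summary() output above
n_obs = 128        # No. Observations
n_features = 5     # Df Model
r_squared = 0.571  # R-squared

# Adjusted R-squared penalises R-squared for the number of features used
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_features - 1)
print('Adj. R-squared:', round(adj_r_squared, 3))

# Each entry: (coef, std err, t reported in the summary)
reported = {
    'const': (-208.4200, 163.810, -1.272),
    '贷款次数': (96.1723, 25.962, 3.704),
    '学历': (113.4520, 37.909, 2.993),
    '性别': (1.9787, 32.286, 0.061),
}

# t statistic = coefficient / standard error
t_stats = {name: coef / std_err for name, (coef, std_err, _) in reported.items()}
for name, t in t_stats.items():
    print(name, round(t, 3))
```

The recomputed adjusted R-squared rounds to the reported 0.553, and the t statistics agree with the summary to three decimals. The rows for 历史贷款金额 and 月收入 are left out because their standard errors are shown rounded to 0.010 and 0.019, which is too coarse to reproduce the reported t values. Reading the P>|t| column, 性别 (gender) has a p-value of 0.951 and a confidence interval that spans zero, so this fit gives no evidence that it affects customer value, while the other four features all have p-values below 0.05.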
