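White's test (`white`) is the usual choice; the Breusch-Pagan test (`bp`) is imported alongside it here.  
For both tests the `null hypothesis` is `no heteroskedasticity`, i.e. the errors are homoskedastic.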

# Load the data and fit the model

In [12]:
import statsmodels.api as sm
import statsmodels.formula.api as smf

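# If the download fails because of network issues, try enabling a proxy
df = sm.datasets.get_rdataset("Guerry", "HistData").data
cols = ['Department', 'Lottery', 'Literacy', 'Wealth', 'Region']  # renamed from `vars` to avoid shadowing the built-in
df = df[cols]
df = df.dropna()  # drop rows that contain missing values
# print(df["Region"].unique())  # the R-style formula automatically takes one category (e.g. `C`) as the reference level and dummy-codes the rest
mod = smf.ols("Lottery ~ Literacy + Wealth + Region", data=df).fit()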
print(mod.summary())

                            OLS Regression Results                            
Dep. Variable:                Lottery   R-squared:                       0.338
Model:                            OLS   Adj. R-squared:                  0.287
Method:                 Least Squares   F-statistic:                     6.636
Date:                Sun, 13 Feb 2022   Prob (F-statistic):           1.07e-05
Time:                        12:10:57   Log-Likelihood:                -375.30
No. Observations:                  85   AIC:                             764.6
Df Residuals:                      78   BIC:                             781.7
Df Model:                           6                                         
Covariance Type:            nonrobust                                         
                  coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------
Intercept      38.6517      9.456      4.087      
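
The commented-out line in the cell above notes that the R-style formula dummy-codes `Region`. A minimal sketch of that behaviour follows. It uses a made-up frame that needs no download: the columns `y`, `x`, `group` and all their values are hypothetical and are not taken from the Guerry data.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data: names and values are invented for illustration only.
toy = pd.DataFrame({
    "y": [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 8.8],
    "x": [0.5, 1.0, 1.6, 2.1, 2.4, 3.1, 3.3, 4.2, 4.4],
    "group": ["C", "E", "N", "C", "E", "N", "C", "E", "N"],
})

# A string column in the formula is treated as categorical automatically.
toy_mod = smf.ols("y ~ x + group", data=toy).fit()

# Each non-reference level gets its own dummy coefficient;
# the reference level ("C" here) is absorbed into the intercept.
param_names = list(toy_mod.params.index)
print(param_names)
```

The printed names should contain `group[T.E]` and `group[T.N]` but no entry for `C`, so each dummy coefficient is read as a difference from the reference level.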

# Heteroskedasticity tests

In [5]:
from statsmodels.stats.diagnostic import het_white
from statsmodels.stats.diagnostic import het_breuschpagan  # Breusch-Pagan test

white_test = het_white(mod.resid,mod.model.exog)
labels = ['Test Statistic', 'Test Statistic p-value', 'F-Statistic', 'F-Test p-value']
print(dict(zip(labels, white_test)))

{'Test Statistic': 18.825547437080086, 'Test Statistic p-value': 0.33863702258957395, 'F-Statistic': 1.1212001268074996, 'F-Test p-value': 0.3536196487868904}


# Remedies

## Use `ols` with `robust standard errors`

In [14]:
mod_robust = mod.get_robustcov_results(cov_type="HC1")  # HC1 is also what Stata's `robust` option uses by default
print(mod_robust.summary())

                            OLS Regression Results                            
Dep. Variable:                Lottery   R-squared:                       0.338
Model:                            OLS   Adj. R-squared:                  0.287
Method:                 Least Squares   F-statistic:                     9.165
Date:                Sun, 13 Feb 2022   Prob (F-statistic):           1.38e-07
Time:                        12:41:11   Log-Likelihood:                -375.30
No. Observations:                  85   AIC:                             764.6
Df Residuals:                      78   BIC:                             781.7
Df Model:                           6                                         
Covariance Type:                  HC2                                         
                  coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------
Intercept      38.6517      8.610      4.489      