# Homoscedasticity 

Homoscedasticity, also called homogeneity of residuals or homogeneity of variance, is the assumption that the variance of the residuals (the vertical distances between the data points and the regression line) is the same for all values of the predictor variable. If this assumption is not met, the data set is said to be heteroscedastic.

![image.png](attachment:image.png)
Image from [Wikipedia](https://en.wikipedia.org/wiki/Homoscedasticity)

![image.png](attachment:image.png)
Image from [Wikipedia](https://en.wikipedia.org/wiki/Heteroscedasticity)
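
The difference between the two pictures can also be seen numerically. The following is a minimal sketch on simulated data, using only the Python standard library so that it runs without the notebook's dependencies. The helper names (`fit_line`, `residual_variance_ratio`), the noise levels, and the choice to split the predictor at its median are illustrative assumptions, not part of any library.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_mean, y_mean = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def residual_variance_ratio(x, y):
    """Residual variance for large x divided by residual variance for small x."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    cutoff = statistics.median(x)
    low = [r for xi, r in zip(x, residuals) if xi <= cutoff]
    high = [r for xi, r in zip(x, residuals) if xi > cutoff]
    return statistics.variance(high) / statistics.variance(low)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
x = [rng.uniform(0, 10) for _ in range(2000)]

# Homoscedastic: the noise has the same standard deviation everywhere.
y_homo = [2 * xi + rng.gauss(0, 1) for xi in x]
# Heteroscedastic: the noise standard deviation grows with x.
y_hetero = [2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

homo_ratio = residual_variance_ratio(x, y_homo)
hetero_ratio = residual_variance_ratio(x, y_hetero)
print(f"homoscedastic ratio:   {homo_ratio:.2f}")
print(f"heteroscedastic ratio: {hetero_ratio:.2f}")
```

A ratio close to 1 is what homoscedastic data should produce, while a ratio far from 1 suggests that the spread of the residuals depends on the predictor. This is only an informal check; the tests below put a p-value on the same idea.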

## Tests for Homoscedasticity

In [59]:
# Uncomment this line upon first running the notebook if these packages
# are not installed locally.
#!pip install --user numpy pandas scipy statsmodels plotly

import numpy as np
import pandas as pd
import scipy
import scipy.stats as ss

import statsmodels
import statsmodels.api as sm
from statsmodels.formula.api import ols

import plotly.express as px
from plotly.subplots import make_subplots

# First, load the Duncan dataset and extract its data frame.
duncan_dataset = sm.datasets.get_rdataset("Duncan", "carData").data

# Then pull out the columns we want as plain lists.
prestige = list(duncan_dataset["prestige"])
income = list(duncan_dataset["income"])
education = list(duncan_dataset["education"])


In [62]:
for column in ["income", "education"]:
    fig = px.scatter(duncan_dataset, x=column, y="prestige", trendline="ols")
    fig.show()

### Bartlett’s Test

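Bartlett's test checks the null hypothesis that $k$ groups all have the same variance. Its test statistic looks large and scary, but it is built from only a few ingredients: the group sizes $n_i$, the total number of observations $N$, the sample variance $S_i^2$ of each group, and the pooled variance $S_p^2 = \frac{1}{N-k}\sum_{i=1}^{k}(n_i-1)S_i^2$.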

$$\chi ^{2}={\frac {(N-k)\ln(S_{p}^{2})-\sum _{{i=1}}^{k}(n_{i}-1)\ln(S_{i}^{2})}{1+{\frac {1}{3(k-1)}}\left(\sum _{{i=1}}^{k}({\frac {1}{n_{i}-1}})-{\frac {1}{N-k}}\right)}}$$
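
To make the formula less intimidating, it can be written out term by term. The sketch below uses only the standard library; the function name `bartlett_statistic` and the toy groups are illustrative assumptions, and it returns only the statistic, not a p-value.

```python
import math
import statistics


def bartlett_statistic(*groups):
    """Bartlett's chi-square statistic, following the formula above."""
    k = len(groups)
    sizes = [len(g) for g in groups]                     # n_i
    n_total = sum(sizes)                                 # N
    variances = [statistics.variance(g) for g in groups]  # S_i^2

    # Pooled variance S_p^2: group variances weighted by degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)

    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    denominator = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    return numerator / denominator


# Toy data: three groups with identical spread, then three with very different spread.
equal_spread = bartlett_statistic([1, 2, 3], [4, 5, 6], [7, 8, 9])
unequal_spread = bartlett_statistic([1, 2, 3], [0, 5, 10], [-20, 0, 20])
print(equal_spread, unequal_spread)
```

When every group has the same sample variance the numerator vanishes and the statistic is zero; the more the variances differ, the larger it gets. Called on the notebook's `income`, `prestige`, and `education` lists, this function should agree with `ss.bartlett` up to floating-point rounding.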

In [57]:
ss.bartlett(income, prestige, education)

BartlettResult(statistic=2.9585556871708203, pvalue=0.2278021377726252)

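Interpreting this p-value: under the null hypothesis that income, prestige, and education all have the same variance, the probability of observing a test statistic at least this large is about 0.23. Since this is well above the usual 0.05 threshold, we fail to reject the null hypothesis of equal variances.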
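
The p-value reported above can be reproduced by hand. Bartlett's statistic is compared against a chi-square distribution with $k-1$ degrees of freedom; with $k=3$ groups that is 2 degrees of freedom, for which the upper-tail probability has the closed form $e^{-x/2}$. The sketch below relies on that special case, and the 0.05 significance level is an assumed convention rather than something fixed by the test.

```python
import math


def chi2_sf_two_df(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2)


statistic = 2.9585556871708203  # value reported by ss.bartlett above
alpha = 0.05                    # assumed significance level

p_value = chi2_sf_two_df(statistic)
reject_equal_variances = p_value < alpha
print(p_value, reject_equal_variances)
```

For any other number of groups the closed form no longer applies, and `ss.chi2.sf(statistic, k - 1)` is the general way to get the same tail probability.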

### Box’s M Test

### Brown-Forsythe Test

### Hartley’s Fmax test
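
Hartley's test is usually defined as the ratio of the largest group variance to the smallest, for groups of equal size. A minimal sketch under that assumption follows; the function name `hartley_fmax` and the toy data are hypothetical, and the critical value would still have to be looked up in a published $F_{max}$ table, which this sketch does not include.

```python
import statistics


def hartley_fmax(*groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    if len({len(g) for g in groups}) != 1:
        raise ValueError("Hartley's test assumes groups of equal size")
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)


# Toy data: sample variances are 1, 25 and 400.
fmax = hartley_fmax([1, 2, 3], [0, 5, 10], [-20, 0, 20])
print(fmax)
```

A ratio near 1 is consistent with equal variances, while a large ratio points towards heteroscedasticity.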

### Levene’s Test

$$W={\frac {(N-k)}{(k-1)}}\cdot {\frac {\sum _{i=1}^{k}N_{i}(Z_{i\cdot }-Z_{\cdot \cdot })^{2}}{\sum _{i=1}^{k}\sum _{j=1}^{N_{i}}(Z_{ij}-Z_{i\cdot })^{2}}}$$

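Here $Z_{ij}$ is the absolute deviation of observation $j$ in group $i$ from the centre of its group, $Z_{i\cdot}$ is the mean of the $Z_{ij}$ within group $i$, and $Z_{\cdot\cdot}$ is the mean of all the $Z_{ij}$. Levene's test is less sensitive than Bartlett's test to departures from normality, so it is preferable when the data may not be normally distributed. Note that `scipy.stats.levene` uses `center='median'` by default, which is the Brown-Forsythe variant; pass `center='mean'` for Levene's original test.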
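
The formula for $W$ can be written out directly. The following standard-library sketch takes the centring function as a parameter, so the same code covers the mean-centred and the median-centred version; the function name `levene_statistic` and the toy groups are illustrative assumptions, and only the statistic is computed, not the p-value.

```python
import statistics


def levene_statistic(*groups, center=statistics.fmean):
    """Levene's W statistic, using `center` to locate the middle of each group."""
    k = len(groups)
    sizes = [len(g) for g in groups]  # N_i
    n_total = sum(sizes)              # N

    # Z_ij: absolute deviation of each observation from its group centre.
    z = []
    for g in groups:
        c = center(g)
        z.append([abs(v - c) for v in g])

    z_group_means = [statistics.fmean(zi) for zi in z]   # Z_i.
    z_grand_mean = sum(sum(zi) for zi in z) / n_total    # Z_..

    between = sum(
        n * (zm - z_grand_mean) ** 2 for n, zm in zip(sizes, z_group_means)
    )
    within = sum(
        (v - zm) ** 2 for zi, zm in zip(z, z_group_means) for v in zi
    )
    return (n_total - k) / (k - 1) * between / within


same_spread = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
different_spread = [[1, 2, 3], [0, 5, 10], [-20, 0, 20]]

w_same = levene_statistic(*same_spread)
w_diff_mean = levene_statistic(*different_spread)
w_diff_median = levene_statistic(*different_spread, center=statistics.median)
print(w_same, w_diff_mean, w_diff_median)
```

Groups with the same spread give $W = 0$ regardless of where they are located, because the statistic only looks at deviations from each group's own centre. The resulting $W$ is compared against an $F$ distribution with $k-1$ and $N-k$ degrees of freedom to obtain a p-value.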

In [58]:
ss.levene(income, prestige, education)

LeveneResult(statistic=2.7711057214580848, pvalue=0.06623739995836857)

## Sources


[Homoscedasticity](https://www.statisticshowto.datasciencecentral.com/homoscedasticity/)