# Effect Size and Power (Instructional Worksheet)

##  Measures of Effect Size

We will go through and calculate the effect size using the various methods for the different statistical tests we have learned about in the last couple of modules. We will use the same datasets that we used in previous modules so that we can examine the effect size for the tests that we previously ran and see how that influences our conclusions.

We will calculate each effect size directly from the equation in the textbook, using the *statsmodels* package (imported as `sms`) and *scipy* to obtain the test statistics that the equations need.

## Two Sample T-test

For the two sample t-test in module 12, we were interested in the mean weight of chicks that were given diet 1 versus diet 2, from the *ChickWeight* dataset. We concluded that the mean weight was significantly different for the chicks on diet 1 vs. diet 2. In fact, we figured out that the mean weight for diet 2 was significantly higher than for diet 1. 

So now, we are interested in the effect size for this difference. Is the effect size large enough that it would be worth it for someone to change their chicks' diet?

#### Cohen's *d*

The formula for Cohen's *d* is as follows:
$$d = (M_1 - M_2) / S_{DV}$$

$M_1$ and $M_2$ refer to the means of the two samples

$S_{DV}$ refers to the standard deviation of the dependent variable for all samples

In [11]:
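import numpy as np
import pandas as pd
import statsmodels.stats as sms
import statsmodels.stats.anova        # makes sms.anova available
import statsmodels.stats.power        # makes sms.power available
import statsmodels.stats.weightstats  # makes sms.weightstats available
from statsmodels.formula.api import ols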

ChickWeight = pd.read_csv('../data/ChickWeight.csv')

In [12]:
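# Mean weight for each diet
M1  = ChickWeight[ChickWeight.Diet==1].weight.mean()
M2  = ChickWeight[ChickWeight.Diet==2].weight.mean()
# Standard deviation of weight over both diets combined
# (NumPy's .std() uses the population formula, ddof=0)
SDV = ChickWeight.loc[(ChickWeight['Diet']==1) | (ChickWeight['Diet']==2), 
                      'weight'].values.std()
d   = (M1-M2)/SDV
d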

-0.31763006239991504

Our results give us a Cohen's *d* value of -0.318. The sign is negative only because diet 1 has the lower mean; the magnitude is what matters. Using the guidelines from the textbook, we conclude that this is a small effect.
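The textbook's guidelines are not reproduced in this worksheet. As a minimal sketch, assuming Cohen's conventional cutoffs for *d* (0.2 small, 0.5 medium, 0.8 large), which may differ from the textbook's, a hypothetical helper such as `label_cohens_d` below turns a *d* value into a label. Only the magnitude is used, so the sign is dropped first.

```python
def label_cohens_d(d, small=0.2, medium=0.5, large=0.8):
    """Label |d| using Cohen's conventional cutoffs (assumed, not taken from the textbook)."""
    size = abs(d)  # the sign only reflects which group was subtracted from which
    if size < small:
        return "negligible"
    if size < medium:
        return "small"
    if size < large:
        return "medium"
    return "large"

# d as reported by the cell above
label = label_cohens_d(-0.31763006239991504)
print(label)
```

The cutoffs are keyword arguments so that the textbook's own values can be substituted if they differ.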

#### Effect size *r*

The formula for the effect size *r* is as follows:
$$r = \sqrt{\frac{t^2}{t^2 + df}}$$

$t$ refers to the t-statistic calculated from the t-test

$df$ refers to the degrees of freedom of the t-test

In [13]:
sms.weightstats.ttest_ind(ChickWeight[ChickWeight.Diet==1].weight,
                          ChickWeight[ChickWeight.Diet==2].weight,
                          usevar='unequal')

(-2.6378338635729652, 0.008995383023243087, 201.38394730819803)

In [14]:
ttest = sms.weightstats.ttest_ind(ChickWeight[ChickWeight.Diet==1].weight,
                                  ChickWeight[ChickWeight.Diet==2].weight,
                                  usevar='unequal')

t = ttest[0]
df = ttest[2]
r = np.sqrt(t**2/(t**2 + df))
r

0.1827506394211888

Our results give us an effect size *r* value of 0.183. Using the guidelines from the textbook, we conclude, again, that this is a small effect.
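Because $t$ enters the formula only as $t^2$, the effect size *r* cannot be negative, whichever group is listed first. The sketch below checks this with the t-statistic and degrees of freedom printed by `ttest_ind` above; the function name `effect_size_r` is only illustrative.

```python
import math

def effect_size_r(t, df):
    """Effect size r = sqrt(t^2 / (t^2 + df)); non-negative by construction."""
    return math.sqrt(t**2 / (t**2 + df))

# t-statistic and Welch degrees of freedom reported by ttest_ind above
t_stat = -2.6378338635729652
dof = 201.38394730819803

r_neg = effect_size_r(t_stat, dof)
r_pos = effect_size_r(-t_stat, dof)  # flipping the sign of t changes nothing
print(round(r_neg, 3), round(r_pos, 3))
```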

## ANOVA *F* Tests

For the ANOVA analysis in module 13, we were again interested in the mean weight of chicks that were given various diets, from the *ChickWeight* dataset. We concluded that there was a significant difference in mean weight between at least two of the different diets.

Again, we are interested in the effect size for this difference. Is the effect size large enough that it would be worth it for someone to change their chicks' diet?

#### Effect size $\eta$

The formula for the effect size $\eta$ is as follows:
$$\eta = \sqrt{\frac{SS_{bet}}{SS_{tot}}}$$

$SS_{bet}$ refers to the sum of squares between samples (i.e., sum of squares for the treatment)

$SS_{tot}$ refers to the total sum of squares (i.e., sum of squares for the treatment and residuals combined)

In [20]:
cw_lm   = ols('weight ~ C(Diet)', data=ChickWeight).fit()
aov     = sms.anova.anova_lm(cw_lm)  # one-way ANOVA table
SS_bet  = aov.sum_sq.iloc[0] # sum of squares for diet
SS_with = aov.sum_sq.iloc[1] # sum of squares for residuals
SS_tot  = SS_bet + SS_with
eta     = np.sqrt(SS_bet/SS_tot)
eta

0.23125165093346575

Our results give us an effect size $\eta$ value of 0.231. Using the guidelines from the textbook, we conclude that this is a small effect.
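To see where the two sums of squares come from, the sketch below computes them by hand for a small set of made-up numbers (the groups and values are hypothetical, not taken from *ChickWeight*) and then applies the $\eta$ formula. It is meant only to illustrate the arithmetic behind the ANOVA table.

```python
import math

# Hypothetical weights for three groups (toy data, not from ChickWeight)
groups = {
    "diet_a": [1.0, 2.0, 3.0],
    "diet_b": [2.0, 3.0, 4.0],
    "diet_c": [5.0, 6.0, 7.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Between-groups SS: group size times squared distance of the group mean
# from the grand mean
ss_bet = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)
# Within-groups SS: squared distance of each value from its own group mean
ss_with = sum(
    (x - sum(values) / len(values)) ** 2
    for values in groups.values()
    for x in values
)
ss_tot = ss_bet + ss_with

eta = math.sqrt(ss_bet / ss_tot)
print(round(ss_bet, 3), round(ss_with, 3), round(eta, 3))
```

In this toy case most of the variation lies between the groups, so $\eta$ is close to 1; in the chick data most of the variation lies within diets, which is why $\eta$ is small.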

## Chi-square Tests
For the Chi-square Tests in module 15, we used our own flower data to (1) see if our data follows the expected distribution and (2) test for independence between two or more variables.

We are interested in the effect size for our chi-square statistic: is the difference meaningful in practical terms?
We use a different statistic depending on the number of variables of interest.

#### Goodness of Fit Test (One Variable)
The formula for the effect size $r$ is as follows:
$$r = \sqrt{\frac{\chi^2}{N(c-1)}}$$

$\chi^2$ refers to the chi-squared statistic

$N$ refers to the total sample size

$c$ refers to the number of categories

In [31]:
from scipy import stats

flower = pd.DataFrame(
    {'color': ['red', 'white'],
     'freq' : [705, 224]})

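# Goodness-of-fit test against the expected 75% red / 25% white split
# (we can get chisquare from the statsmodels library as well)
chi = stats.chisquare(flower.freq, 
                      [.75*flower.freq.sum(), .25*flower.freq.sum()])

N = flower.freq.sum()  # total sample size (929)
c = len(flower)        # number of categories (2)
chistat = chi[0]       # chi-squared statistic

r = np.sqrt(chistat/(N*(c-1)))
r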

0.0205086747936035

Our results give us an effect size $r$ value of 0.021. Using the guidelines from the textbook, we conclude that this is a small effect.
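The same number can be reproduced without `scipy` by writing out the chi-squared sum $\sum (O - E)^2 / E$ directly. The sketch below uses the worksheet's counts and the 75%/25% expected split; the variable names are only illustrative.

```python
import math

observed = [705, 224]          # red and white counts from the worksheet
expected_props = [0.75, 0.25]  # hypothesised proportions

n_total = sum(observed)        # N
n_categories = len(observed)   # c
expected = [p * n_total for p in expected_props]

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
r_gof = math.sqrt(chi_sq / (n_total * (n_categories - 1)))
print(round(chi_sq, 4), round(r_gof, 4))
```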

#### Test for Independence (Two Variables)
The effect size statistic for the test of independence depends on the number of categories in each variable. 
In module 15, we had 2 categories for each of 2 variables. In this situation we use $\phi$ as our effect size statistic.
The formula for the effect size $\phi$ is as follows:
$$\phi = \sqrt{\frac{\chi^2}{N}}$$

$\chi^2$ refers to the chi-squared statistic

$N$ refers to the total sample size

In [38]:
flower['surv'] = [448,103]
chi = stats.chi2_contingency(flower[['freq', 'surv']])
N = 929
chi_st = chi[0]
phi = np.sqrt(chi_st/N)
phi

0.07756509358793227

Our results give us an effect size $\phi$ value of 0.0776. Using the guidelines from the textbook, we conclude that this is a small effect.

If either variable has more than 2 categories, we instead use Cramer's *V* as our effect size statistic.
Let's suppose that we also have pink flowers. There are 97 pink flowers, and 78 of their plants survived for the season. First, add this information to your data frame and then rerun your chi-squared analysis. 

In [40]:
flower2 = pd.DataFrame({'color': ['red','white','pink'],
                        'freq': [705,224,97],
                        'surv': [448,103,78]})
chi2 = stats.chi2_contingency(flower2[['freq', 'surv']])
chi_st2 = chi2[0]

In this situation, we will use Cramer's *V* as our effect size statistic. The formula is as follows:
$$V = \sqrt{\frac{\phi^2}{\min(R, C) - 1}}$$

$R$ is the number of rows

$C$ is the number of columns

$\phi$ is calculated from the formula above

So, first let's calculate $\phi$ and then we can calculate *V*.

In [42]:
N = 929
phi = np.sqrt(chi_st2/N)
phi2 = phi**2
#R = 3, C = 3 - 3 rows and 3 columns
V = np.sqrt(phi2/(3-1))
V

0.0712480211571941

Our results give us an effect size $V$ value of 0.0712. Using the guidelines from the textbook, we conclude that this is a small effect.
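As a cross-check on the formula, here is a minimal sketch of Cramer's *V* written as a function of the contingency table itself. It rests on three assumptions that differ from the cell above: $N$ is taken as the grand total of the table passed in, $R$ and $C$ are read from the table's shape rather than typed in, and the chi-squared statistic is the uncorrected Pearson sum (`stats.chi2_contingency` applies a continuity correction to 2x2 tables by default). The function name and the example tables are hypothetical.

```python
import math

def cramers_v(table):
    """Cramer's V for a table given as a list of rows of counts.

    Uses the uncorrected Pearson chi-squared statistic, takes N as the
    grand total of the table, and reads R and C from the table's shape.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    n_total = sum(row_totals)

    chi_sq = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n_total
            chi_sq += (table[i][j] - expected) ** 2 / expected

    phi_sq = chi_sq / n_total
    return math.sqrt(phi_sq / (min(n_rows, n_cols) - 1))

# Hypothetical tables chosen so the answers are easy to check by hand
v_perfect = cramers_v([[10, 0], [0, 10]])           # perfectly associated 2x2
v_none = cramers_v([[10, 20], [20, 40], [30, 60]])  # 3x2 with proportional rows
print(v_perfect, v_none)
```

For a 2x2 table $\min(R, C) - 1 = 1$, so *V* reduces to $\phi$; the two toy tables mark the ends of the scale, with *V* near 1 for perfect association and 0 for none.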

## Power

Once we have the effect size statistic, we can calculate power. Recall that power is the probability that we correctly reject a null hypothesis that is in fact false. Ideally, a power of at least 80% is desired.

#### Power for 2-sample t-test with unequal sample size

We will use the function `tt_ind_solve_power()` from `sms.power` to calculate power. It takes the size of the first sample, the ratio of the second sample size to the first, and Cohen's effect size (*d*) from above. We will use a significance level of 5% (type 1 error probability).

In [46]:
sms.power.tt_ind_solve_power(effect_size=100, nobs1=220, 
                             power=None, alpha=0.05, ratio=0.5)

1.0
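The call above passes `effect_size=100`, and with an effect that large the power is 1.0 for almost any sample size. As a rough sketch of the calculation with the Cohen's *d* from the t-test section, the function below uses a normal approximation to the two-sided, two-sample test. This is a swapped-in approximation, not the noncentral *t* calculation that statsmodels performs, and the second sample size of 120 is an assumption that should be checked against the data.

```python
from statistics import NormalDist
import math

def approx_power_two_sample(d, n1, n2, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample t-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Noncentrality: effect size scaled by the effective sample size
    delta = abs(d) * math.sqrt(n1 * n2 / (n1 + n2))
    # Probability of landing in either rejection region
    return std_normal.cdf(delta - z_crit) + std_normal.cdf(-delta - z_crit)

# d from the Cohen's d cell; n1=220 from the call above; n2=120 is an
# assumption about the number of diet 2 observations
power = approx_power_two_sample(d=-0.31763006239991504, n1=220, n2=120)
print(round(power, 2))
```

With statsmodels, the same inputs would be passed to `tt_ind_solve_power` as `effect_size=abs(d)`, `nobs1=220`, and `ratio=120/220`, since `ratio` is the second sample size divided by the first.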