# Effect Size and Power (Instructional Worksheet)

## Measures of Effect Size

We will calculate effect sizes for the statistical tests covered in the last couple of modules. We will reuse the datasets from those modules so that we can examine the effect size for each test we previously ran and see how it influences our conclusions.

We will calculate each effect size using two different methods: first, directly from the equation in the textbook, and second, using a function from the *statsmodels* package (imported here as *sms*).

## Two Sample T-test

For the two sample t-test in module 12, we were interested in the mean weight of chicks that were given diet 1 versus diet 2, from the *ChickWeight* dataset. We concluded that the mean weight was significantly different for the chicks on diet 1 vs. diet 2. In fact, we figured out that the mean weight for diet 2 was significantly higher than for diet 1. 

So now, we are interested in the effect size for this difference. Is the effect large enough that it would be worth it for someone to change their chicks' diet?

#### Cohen's *d*

The formula for Cohen's *d* is as follows:
$$d = (M_1 - M_2) / S_{DV}$$

$M_1$ and $M_2$ refer to the means of the two samples

$S_{DV}$ refers to the standard deviation of the dependent variable across all observations from both samples

In [28]:
import pandas as pd
import numpy as np
import statsmodels.stats as sms

ChickWeight = pd.read_csv('../data/ChickWeight.csv')

In [26]:
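# Mean weight for each diet group
M1  = ChickWeight[ChickWeight.Diet==1].weight.mean()
M2  = ChickWeight[ChickWeight.Diet==2].weight.mean()
# SD of weight across both groups combined (NumPy's .std() uses ddof=0 by default)
SDV = ChickWeight.loc[(ChickWeight['Diet']==1) | (ChickWeight['Diet']==2), 
                      'weight'].values.std()
d   = (M1-M2)/SDV
d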

-0.31763006239991504

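Our result is a Cohen's *d* of about -0.32. The sign is negative only because diet 1, which has the lower mean, was entered first; the magnitude is what we interpret. Using the guidelines from the textbook, we conclude that this is a small effect.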
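The value of *d* depends on which standard deviation goes in the denominator. The sketch below compares the definition used above (the SD of all observations combined) with the pooled sample SD, a common alternative. The function names and the data are made up for illustration and are not taken from *ChickWeight*; only the standard library is used so that it runs without the dataset.

```python
import math
import statistics


def cohens_d_overall(x, y):
    """Cohen's d with the SD of all observations combined (S_DV above)."""
    # pstdev divides by N, matching NumPy's default ddof=0
    sd_all = statistics.pstdev(list(x) + list(y))
    return (statistics.mean(x) - statistics.mean(y)) / sd_all


def cohens_d_pooled(x, y):
    """Cohen's d with the pooled sample SD (a common alternative)."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)


# Hypothetical weights, NOT from the ChickWeight dataset
group_a = [10.0, 12.0, 14.0]
group_b = [13.0, 15.0, 17.0]

d_overall = cohens_d_overall(group_a, group_b)
d_pooled = cohens_d_pooled(group_a, group_b)
print(round(d_overall, 3), round(d_pooled, 3))
```

When the group means differ, the combined SD also absorbs the between-group spread, so it tends to give a smaller magnitude of *d* than the pooled SD does. It is worth stating which version was used when reporting the result.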

#### Effect size *r*

The formula for the effect size *r* is as follows:
$$r = \sqrt{t^2 / (t^2 + df)}$$

$t$ refers to the t-statistic calculated from the t-test

$df$ refers to the degrees of freedom of the t-test
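As a minimal sketch, the formula for *r* can be wrapped in a small helper. The function name `effect_size_r` and the input values are hypothetical, chosen only to show the arithmetic; they are not results from the chick data.

```python
import math


def effect_size_r(t, df):
    """Effect size r from a t statistic and its degrees of freedom."""
    return math.sqrt(t**2 / (t**2 + df))


# Hypothetical inputs, not taken from the ChickWeight t-test
r = effect_size_r(t=-2.5, df=40)
print(round(r, 3))
```

Because *t* is squared, *r* computed this way is always between 0 and 1 and carries no sign; the direction of the difference has to be read from the group means.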

In [33]:
# Load the weightstats submodule so that sms.weightstats resolves
import statsmodels.stats.weightstats

# Welch's t-test (unequal variances);
# returns (t statistic, p-value, degrees of freedom)
ttest = sms.weightstats.ttest_ind(ChickWeight[ChickWeight.Diet==1].weight,
                                  ChickWeight[ChickWeight.Diet==2].weight,
                                  usevar='unequal')
ttest

# https://www.statsmodels.org/dev/generated/statsmodels.stats.weightstats.ttest_ind.html
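To see where the *t* and *df* used in the *r* formula come from, here is a hand-rolled sketch of an unequal-variance (Welch) t-test using only the standard library. The helper `welch_t` and the data are hypothetical and serve as a cross-check of the arithmetic, not as a replacement for the library function.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    n1, n2 = len(x), len(y)
    # Squared standard error contributed by each group
    v1 = statistics.variance(x) / n1
    v2 = statistics.variance(y) / n2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical weights, NOT from the ChickWeight dataset
group_a = [10.0, 12.0, 14.0]
group_b = [13.0, 15.0, 17.0, 19.0]

t_stat, dof = welch_t(group_a, group_b)
r_welch = math.sqrt(t_stat**2 / (t_stat**2 + dof))
print(round(t_stat, 3), round(dof, 3), round(r_welch, 3))
```

With unequal variances the degrees of freedom are generally not a whole number, so the *df* plugged into the *r* formula should be the value reported by the test rather than $n_1 + n_2 - 2$.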