# Hypothesis Testing - Power

The test calculates a p-value that is interpreted to decide whether the samples are consistent with each other (fail to reject the null hypothesis) or whether there is a statistically significant difference between them (reject the null hypothesis). A common significance level for interpreting the p-value is 5% or 0.05.

**Significance level (alpha)**: 5% or 0.05.

The size of the effect when comparing two groups can be quantified with an effect size measure. A common measure of the difference between the means of two groups is Cohen’s d. It is a standardized score that expresses the difference between the means in units of standard deviations. By common convention, a Cohen’s d of 0.80 or higher is considered a large effect.
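
As a rough illustration of the measure described above, the sketch below computes Cohen’s d from two samples using the pooled standard deviation. The function name `cohens_d` and the two small data lists are hypothetical and only serve as an example; this is one common formulation, not necessarily the one used elsewhere in this notebook.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(sample1, sample2):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    # Sample variances (n - 1 in the denominator)
    v1, v2 = variance(sample1), variance(sample2)
    # Pooled standard deviation, weighting each variance by its degrees of freedom
    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    # Difference in means expressed in units of the pooled standard deviation
    return (mean(sample1) - mean(sample2)) / pooled_sd


# Hypothetical data, made up for illustration only
group_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
group_b = [4.2, 4.8, 5.0, 4.5, 5.3, 4.6]
print("Cohen's d: %.3f" % cohens_d(group_a, group_b))
```

The sign of the result only reflects which sample is passed first; the magnitude is what is compared against thresholds such as 0.80.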

**Effect Size**: Cohen’s d of at least 0.80.

For statistical power, we can use the common default and require a minimum power of 80% or 0.8.

**Statistical Power**: 80% or 0.80.

For a given experiment with these defaults, we may be interested in estimating a suitable sample size. That is, how many observations are required in each sample to detect an effect of at least 0.80 with an 80% chance of detecting the effect if it is real (a 20% chance of a Type II error) and a 5% chance of detecting an effect when there is none (a Type I error).

We can solve this using a power analysis.
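
To give a feel for what such an analysis computes, here is a minimal sketch of the textbook normal approximation for a two-sided, two-sample comparison with equal group sizes: n per group ≈ 2 (z_{1-alpha/2} + z_{power})^2 / d^2. The function name `approx_n_per_group` is hypothetical. This is only an approximation to a t-test-based power analysis, not a replacement for it.

```python
from statistics import NormalDist


def approx_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate observations per group using the normal approximation."""
    std_normal = NormalDist()
    # Critical value for a two-sided test at significance level alpha
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power
    z_power = std_normal.inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2


# Defaults discussed above: d = 0.80, alpha = 0.05, power = 0.80
print('Approximate sample size per group: %.3f' % approx_n_per_group(0.8))
```

Because it ignores the extra uncertainty from estimating the standard deviation, this approximation tends to come out slightly lower than a t-distribution-based calculation.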



A note on sample size: the `solve_power()` function used below has an argument called `ratio`, which is the ratio of the number of observations in the second sample to the number in the first (so that nobs2 = nobs1 * ratio). If both samples are expected to have the same number of observations, the ratio is 1.0. If, for example, the second sample is expected to have half as many observations as the first, the ratio is 0.5.
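
The effect of the ratio argument can be sketched as follows, assuming `statsmodels` is installed. The values (effect size 0.80, ratio 0.5) are chosen purely for illustration, and the variable names `nobs1`, `nobs2`, and `balanced` are local to this example.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Balanced design: both samples have the same number of observations
balanced = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8,
                                nobs1=None, ratio=1.0)

# Unbalanced design: the second sample is half the size of the first
ratio = 0.5
nobs1 = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8,
                             nobs1=None, ratio=ratio)
nobs2 = nobs1 * ratio  # size of the second sample implied by the ratio

print('Balanced: %.3f per sample' % balanced)
print('Unbalanced: %.3f in sample 1, %.3f in sample 2' % (nobs1, nobs2))
```

With an unbalanced design, the first sample has to be larger than in the balanced case to reach the same power, and the total number of observations goes up as well.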



#### Libraries

In [2]:
import numpy as np
import pandas as pd
from scipy import stats
from sklearn import datasets

In [3]:
from statsmodels.stats.power import TTestIndPower

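effect_size = 0.5  # Cohen's d; note this is a medium effect, smaller than the 0.80 discussed above
alpha = 0.05  # significance level
power = 0.8  # desired statistical power
# perform power analysis: leaving nobs1 as None makes solve_power() solve for it
analysis = TTestIndPower()
sample_size = analysis.solve_power(effect_size, power=power, nobs1=None, ratio=1.0, alpha=alpha)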
print('Sample Size: %.3f' % sample_size)

Sample Size: 63.766
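
Since a sample cannot contain a fraction of an observation, the result would normally be rounded up. As a sanity check, the sketch below uses the `power()` method of the same class to compute the power achieved at the whole-number sample sizes on either side of the solved value, again assuming equal group sizes.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power achieved at the whole-number sample sizes around the solved value
power_63 = analysis.power(effect_size=0.5, nobs1=63, alpha=0.05, ratio=1.0)
power_64 = analysis.power(effect_size=0.5, nobs1=64, alpha=0.05, ratio=1.0)

print('n = 63 per sample -> power = %.4f' % power_63)
print('n = 64 per sample -> power = %.4f' % power_64)
```

Rounding up to 64 observations per sample keeps the power at or above the 0.8 target, whereas rounding down to 63 falls just short of it.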
