# Power

Power relates to the ability to detect the presence of a true effect and is an important component of experimental design. We will consider a general-purpose simulation approach to estimating the power of an experimental design. We will also investigate how we can visualize such data using line plots and two-dimensional image plots.

## Calculating power given effect size and sample size

We will begin by considering a scenario in which we have an effect size and sample size in mind and we would like to know the associated power.

The key to determining power using a simulation approach is to perform a large number of simulated experiments, each time calculating our test statistic (independent samples t-test, in this case) and accumulating the number of times we reject the null hypothesis. The power is simply the proportion of times that we are able to reject the null hypothesis (remembering that we control the population means and we know that there is a true difference).

In [10]:
import numpy as np

import scipy.stats

n_per_group = 25

# effect size = 0.8
group_means = [0.0, .8]
group_sigmas = [1.0, 1.0]

n_groups = len(group_means)

# number of simulations
n_sims = 100

# store the p value for each simulation
sim_p = np.empty(n_sims)
sim_p.fill(np.nan)

for i_sim in range(n_sims):

    data = np.empty([n_per_group, n_groups])
    data.fill(np.nan)

    # simulate the data for this 'experiment'
    for i_group in range(n_groups):

        data[:, i_group] = np.random.normal(
            loc=group_means[i_group],
            scale=group_sigmas[i_group],
            size=n_per_group
        )

    result = scipy.stats.ttest_ind(data[:, 0], data[:, 1])

    sim_p[i_sim] = result[1]

# number of simulations where the null was rejected
n_rej = np.sum(sim_p < 0.05)

prop_rej = n_rej / float(n_sims)

print ("Power: ", prop_rej)

Power:  0.84


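Our estimated power to detect a large effect size with 25 participants per group is about 84%. That is, if a large effect size is truly present then we would expect to be able to reject the null hypothesis (at an alpha of 0.05) about 84% of the time. Note that this estimate is based on only 100 simulations, so it is fairly noisy (its standard error is roughly 0.04); increasing `n_sims` gives a more stable estimate.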
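Because the power estimate is a proportion computed over `n_sims` simulated experiments, its precision depends on how many simulations are run. A minimal sketch of the usual binomial standard error for such a proportion is given here; the helper name `power_standard_error` is hypothetical and is not part of the code above.

```python
import numpy as np


def power_standard_error(prop_rej, n_sims):
    """Binomial standard error of a simulated power estimate.

    Hypothetical helper: treats each simulated experiment as an
    independent reject / do-not-reject trial.
    """
    return np.sqrt(prop_rej * (1.0 - prop_rej) / n_sims)


# the proportion printed above, which came from 100 simulations
print(power_standard_error(0.84, 100))

# the same proportion would be estimated far more precisely from 10000 simulations
print(power_standard_error(0.84, 10000))
```

The standard error shrinks with the square root of `n_sims`, so running 100 times as many simulations narrows the uncertainty by a factor of 10.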

## Required sample size to achieve a given power for a given effect size

Oftentimes you want to know how many samples you need to achieve a desired level of power (80% is typical). Again, suppose you have a good guess for the effect size that you expect to see. We then simply increase the sample size until we get the desired level of power.

In [2]:
import numpy as np

import scipy.stats

# start at 20 participants
n_per_group = 20

# effect size = 0.8
group_means = [0.0, 0.8]
group_sigmas = [1.0, 1.0]

n_groups = len(group_means)

# number of simulations
n_sims = 10000

# power level that we would like to reach
desired_power = 0.8

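# initialise the power for the current sample size to a small value
current_power = 0.0


def calculate_power(n_per_group):
    """Estimate power by simulation for the given per-group sample size."""

    # simulate every 'experiment' at once: (n_sims, n_per_group, n_groups)
    data = np.random.normal(
        loc=group_means,
        scale=group_sigmas,
        size=(n_sims, n_per_group, n_groups)
    )

    # one independent samples t-test per simulated experiment
    result = scipy.stats.ttest_ind(data[..., 0], data[..., 1], axis=1)

    # proportion of simulations where the null was rejected
    return np.mean(result[1] < 0.05)


# keep iterating until desired power is obtained
while current_power < desired_power:

    current_power = calculate_power(n_per_group)

    print("With {n:d} samples per group, power = {p:.3f}".format(
        n=n_per_group,
        p=current_power
    ))

    # increase the number of samples by one for the next iteration of the loop
    n_per_group += 1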

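Running this loop, we reach the desired power with somewhere between 25 and 27 participants per group. The exact stopping point can differ slightly between runs because each power estimate is itself based on random simulations.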
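As a sanity check on the simulated answer, the power of a two-sided independent samples t-test with equal group sizes can also be computed directly from the noncentral t distribution. The sketch below is not part of the simulation approach itself; the helper name `analytic_power` is hypothetical, and it assumes normally distributed data with equal variances in the two groups.

```python
import numpy as np
import scipy.stats


def analytic_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided independent samples t-test (equal group sizes).

    Hypothetical helper; assumes normal data with equal variances.
    """
    df = 2 * n_per_group - 2
    # noncentrality parameter for two equal-sized groups
    nc = effect_size * np.sqrt(n_per_group / 2.0)
    # critical value of the central t distribution
    t_crit = scipy.stats.t.ppf(1.0 - alpha / 2.0, df)
    # probability that the test statistic lands in either rejection region
    return (
        scipy.stats.nct.sf(t_crit, df, nc)
        + scipy.stats.nct.cdf(-t_crit, df, nc)
    )


for n in range(24, 29):
    print("n = {n:d}, power = {p:.3f}".format(n=n, p=analytic_power(0.8, n)))
```

If the simulation is behaving as expected, the sample size at which the simulated power crosses 80% should sit close to the point where this analytic value does.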

## Visualizing the sample size/power relationship

Sometimes we do not have a single level of power in mind at the moment but would like to see the relationship between sample size and power so that we can see the costs and benefits of a particular sample size.

First, perform the simulations across a fixed set of sample sizes.
Then, plot the power as a function of the sample size.
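The two steps could be sketched as follows. This is an illustration under several assumptions rather than a definitive implementation: the helper `simulate_power`, the range of sample sizes, the number of simulations, the output filename, and the use of matplotlib for the line plot are all choices made for the example.

```python
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt


def simulate_power(n_per_group, effect_size=0.8, n_sims=2000, alpha=0.05):
    """Estimate power by simulation (hypothetical helper)."""
    # every simulated 'experiment' at once: one row per experiment
    group_a = np.random.normal(loc=0.0, scale=1.0, size=(n_sims, n_per_group))
    group_b = np.random.normal(
        loc=effect_size, scale=1.0, size=(n_sims, n_per_group)
    )
    # one independent samples t-test per row
    result = scipy.stats.ttest_ind(group_a, group_b, axis=1)
    # proportion of simulations where the null was rejected
    return np.mean(result.pvalue < alpha)


# the sample sizes to evaluate (an arbitrary choice for illustration)
ns_per_group = np.arange(5, 55, 5)

# step 1: simulate the power at each sample size
power = np.array([simulate_power(n) for n in ns_per_group])

# step 2: plot power as a function of sample size
fig, ax = plt.subplots()
ax.plot(ns_per_group, power, marker="o")
# dashed reference line at the conventional 80% power level
ax.axhline(0.8, linestyle="--", color="grey")
ax.set_xlabel("Sample size per group")
ax.set_ylabel("Power")
ax.set_ylim(0.0, 1.0)
fig.savefig("power_by_sample_size.png")
```

The curve rises steeply at small sample sizes and then flattens, which makes the diminishing return from adding further participants easy to see.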

## Visualizing power across varying sample and effect sizes

Sometimes we do not have a single effect size in mind but would like to see how power depends on both the effect size and the sample size.

First, perform the simulations across a fixed set of effect sizes and sample sizes.
Then, plot the power as a function of both, for example as a two-dimensional image plot.
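One way to sketch this is to estimate the power on a grid of effect sizes and sample sizes and show the result as an image. The helper `simulate_power`, the grid values, the number of simulations, the output filename, and the use of matplotlib's `imshow` are assumptions made for this example.

```python
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt


def simulate_power(n_per_group, effect_size, n_sims=1000, alpha=0.05):
    """Estimate power by simulation (hypothetical helper)."""
    group_a = np.random.normal(loc=0.0, scale=1.0, size=(n_sims, n_per_group))
    group_b = np.random.normal(
        loc=effect_size, scale=1.0, size=(n_sims, n_per_group)
    )
    # one independent samples t-test per simulated experiment
    result = scipy.stats.ttest_ind(group_a, group_b, axis=1)
    return np.mean(result.pvalue < alpha)


# grid of conditions (arbitrary values chosen for illustration)
effect_sizes = np.array([0.2, 0.5, 0.8, 1.1])
ns_per_group = np.arange(10, 60, 10)

# rows index effect size, columns index sample size
power = np.empty((len(effect_sizes), len(ns_per_group)))

for i_es, effect_size in enumerate(effect_sizes):
    for i_n, n_per_group in enumerate(ns_per_group):
        power[i_es, i_n] = simulate_power(n_per_group, effect_size)

fig, ax = plt.subplots()
# origin="lower" puts the smallest effect size at the bottom of the image
image = ax.imshow(power, origin="lower", aspect="auto", vmin=0.0, vmax=1.0)
ax.set_xticks(range(len(ns_per_group)))
ax.set_xticklabels(ns_per_group)
ax.set_yticks(range(len(effect_sizes)))
ax.set_yticklabels(effect_sizes)
ax.set_xlabel("Sample size per group")
ax.set_ylabel("Effect size")
fig.colorbar(image, ax=ax, label="Power")
fig.savefig("power_by_effect_and_sample_size.png")
```

Each cell of the image is an independent simulation-based estimate, so a coarse grid with few simulations will look speckled; a finer grid or a larger `n_sims` smooths it at the cost of run time.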