Module 2, Lab 5 - Power 2 - Sample Size Planning Illustration
=============================================================

In this lab, we will practice using the `tt_ind_solve_power` function from the Python `statsmodels.stats.power` module to determine the minimum sample size needed for a two-sample *t*-test design. I illustrate a realistic, iterative project-planning sequence so you can see how power planning can be integrated into a data-science research project.

First, load the required software:

In [None]:
from statsmodels.stats.power import tt_ind_solve_power

You wish to compare groups at two different locations in your
organization to see if either group is more satisfied with their working
conditions. You will be comparing the groups with a *t*-test, and you
care deeply about estimating the effect, even if it is small. Data will
be challenging to get, however, as you will have to get managers to ask
employees to return surveys. You need to determine the minimum necessary
sample size to get adequate power.

Usually, we start with our dream scenario. Let's ask for 90% power to
detect a very small effect size (*d* = .10). What sample size would be
required? Passing `nobs1=None` tells the function to solve for the
per-group sample size.

In [None]:
tt_ind_solve_power(effect_size=0.1, nobs1=None, alpha=0.05, power=0.9, ratio=1, alternative='two-sided')

We see here that we need 2103 people per group, or 4206 people in total.
Knowing the size of the organization, you know a sample that large is
out of the question. You think you might be able to get away with
collecting data from 500 participants without imposing too much on
team supervisors. So, you try again, this time with a more realistic 80%
power and a mid-range-small effect of *d* = .25.

In [None]:
tt_ind_solve_power(effect_size=0.25, nobs1=None, alpha=0.05, power=0.8, ratio=1, alternative='two-sided')

By pure happenstance, you get 253 per group, or 506 in total, which is
almost exactly your target (always round up to ensure the sample size is
adequate). You take this proposal to collect data from 500 employees to
your supervisor; after some discussion, you are told that they will try
to push for a large sample (on the basis of your request), but they've
decided 400 is the maximum they are likely to be able to collect.
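
The two sample sizes computed so far can be sanity-checked by hand. The sketch below uses the textbook normal approximation, n ≈ 2(z_{1-α/2} + z_{power})² / d² per group, which is not necessarily what `tt_ind_solve_power` does internally; the helper name `approx_n_per_group` is made up for this illustration. Because the approximation ignores the heavier tails of the *t* distribution, it should land about one participant per group below the values reported above.

```python
import math
from statistics import NormalDist


def approx_n_per_group(d, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group (two-sided, two-sample test)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the requested power
    return 2 * (z_alpha + z_power) ** 2 / d ** 2


dream = approx_n_per_group(0.10, power=0.9)      # the "dream" scenario
realistic = approx_n_per_group(0.25, power=0.8)  # the scaled-back scenario

# Always round up: a fraction of a participant cannot be recruited.
print(math.ceil(dream), math.ceil(realistic))
```

Doubling each per-group figure gives the total number of surveys to request.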

Now, you change your strategy. 400 is close to 500, so the detectable
effect is likely to be similar. You now set `effect_size=None` so the
function solves for it, and input *n* = 200 (since `nobs1` is per
group) along with a request for 80% power:

In [None]:
tt_ind_solve_power(effect_size=None, nobs1=200, alpha=0.05, power=0.8, ratio=1, alternative='two-sided')

You will now have 80% power to detect effects as small as *d* = .28,
which is still a mid-range small effect.
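
The minimum detectable effect can also be approximated by hand by rearranging the normal approximation into d ≈ (z_{1-α/2} + z_{power}) · sqrt(2 / n). This is a rough sketch rather than the exact noncentral-*t* calculation, and `approx_min_effect` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def approx_min_effect(n_per_group, alpha=0.05, power=0.8):
    """Normal-approximation smallest d detectable with the requested power."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * math.sqrt(2 / n_per_group)


d_min = approx_min_effect(200)  # 200 per group, 400 in total
print(round(d_min, 2))
```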

Before you tell everyone that this sample size will still work, you run
a loop to estimate power at that sample size for various effect sizes
(e.g., power would suffer if *d* were lower, but would it be *that*
terrible if *d* were, say, .10?).

In [None]:
# range() excludes its stop value, so stop at 55 to include d = 0.50
d_values = [x / 100.0 for x in range(5, 55, 5)]

# Power at n = 200 per group for each candidate effect size
powers = [tt_ind_solve_power(effect_size=d, nobs1=200, alpha=0.05, power=None, ratio=1, alternative='two-sided')
          for d in d_values]
powers

Looking at this list, we see that power really starts to drop off
around *d* = .20, where it hits 51%, and it falls to roughly 17% at
*d* = .10. You discuss this with your team; they conclude they are OK
with a roughly 50% chance of declaring "no difference" if the effect is
*that* small. The study is run with 400 people and an informative
result is produced.
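
A bare list of powers is hard to read without the matching effect sizes beside it. The sketch below pairs each *d* with an approximate power, using the normal approximation power ≈ Φ(d·sqrt(n/2) − z) + Φ(−d·sqrt(n/2) − z), where z is the two-sided critical value. The numbers will therefore differ slightly from the `tt_ind_solve_power` output, and `approx_power` is an illustrative name, not part of `statsmodels`.

```python
import math
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of landing beyond the critical value in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


d_values = [x / 100 for x in range(5, 55, 5)]  # 0.05, 0.10, ..., 0.50
table = [(d, approx_power(d, 200)) for d in d_values]

for d, p in table:
    print(f"d = {d:.2f}  power ~ {p:.2f}")
```

Printing the pairs side by side makes it easy to spot where power crosses a threshold such as 50% or 80%.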

Epilogue
========

At the very end, you hear that the original proposal had been to collect
40 responses, 20 from each site. You smile to yourself, considering how
your power analysis likely saved the project. You run a power analysis
just to see how bad the situation would be:

In [None]:
tt_ind_solve_power(effect_size=None, nobs1=20, alpha=0.05, power=0.8, ratio=1, alternative='two-sided')

You see here that the smallest effect size for which you would have good
power (*d* ≈ .91) is well into the 'large' range. What if the effects
were small? What are the odds the study would even be able to pick them
up? You consider the scenario of *d* = .25:

In [None]:
tt_ind_solve_power(effect_size=0.25, nobs1=20, alpha=0.05, power=None, ratio=1, alternative='two-sided')

This study would have 12% power. Yikes. It's a good thing you performed
a power analysis.
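
A simulation offers an independent check on that figure. The sketch below repeatedly draws 20 scores per site, runs a pooled-variance *t*-test, and counts how often the result is significant. It rests on several assumptions that are not in the lab itself: scores at both sites are normal with unit standard deviation, the critical value 2.024 (the two-sided 5% cutoff for *t* with 38 degrees of freedom) is hard-coded, and the seed and repetition count are arbitrary.

```python
import math
import random
from statistics import mean, variance

random.seed(2024)  # arbitrary seed so the run is repeatable

T_CRIT = 2.024  # assumed: two-sided 5% critical value of t with 38 df
N, D, REPS = 20, 0.25, 5000  # per-group n, true effect size, simulated studies


def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


hits = 0
for _ in range(REPS):
    site_a = [random.gauss(D, 1) for _ in range(N)]  # true mean shifted by d
    site_b = [random.gauss(0, 1) for _ in range(N)]
    if abs(pooled_t(site_a, site_b)) > T_CRIT:
        hits += 1  # this simulated study reached significance

sim_power = hits / REPS
print(f"simulated power ~ {sim_power:.2f}")
```

The simulated proportion should hover near the analytic answer, with some Monte Carlo noise that shrinks as `REPS` grows.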