## Randomization

In the previous chapter, we saw how randomization eliminates selection bias. Let's explain what we mean by randomization, describe several ways we might want to randomly assign treatments, and discuss the components *other than* the assignment that can be randomized.

Randomization refers to using "a known, well-understood probabilistic scheme" to assign treatments to units (Oehlert, 2010).

### Simple Random Assignment

With simple random assignment, every unit has the same probability of being assigned to a particular treatment group. The probability can be anything greater than zero and less than one. Because each unit is assigned independently, this probability determines the number of units in each group only *approximately*. For example, assuming a single treatment group and a single control group, if the probability is 0.75, about 75% of units will be assigned to the treatment group.
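To make the 0.75 example concrete, here is a minimal sketch of unit-by-unit assignment. The number of units (1,000), the seed, and the variable names are assumptions chosen for illustration, not values from this chapter.

```python
import numpy as np

rng = np.random.default_rng(42)  # seed is arbitrary, for illustration only

n_units = 1000   # hypothetical number of units
p_treat = 0.75   # probability that each unit is assigned to treatment

# One independent draw per unit: True = treatment, False = control
assignment = rng.random(n_units) < p_treat

n_treated = int(assignment.sum())
share_treated = n_treated / n_units
print(n_treated, share_treated)
```

The share assigned to treatment should land near 0.75, but nothing forces it to be exactly 0.75.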

Let's imagine we have 10 units to which we assign a treatment with 0.5 probability. Will our groups be balanced? That is, will we have 5 units in the treatment group and 5 units in the control group? Let's find out.

In [1]:
import numpy as np

n, p = 10, 0.5
np.random.binomial(n, p)

4

This counts the number of successes (think of "success" as being assigned to the treatment group) in 10 *independent* trials, where each trial succeeds with probability 0.5.

Each time you run the cell above, you may get a different result, and it's not always 5! This is a drawback of simple random assignment: it does not guarantee balanced groups.

> [Y]ou could flip a coin to assign each of 10 [units] to the treatment condition, but there is only a 24.6% chance of ending up with exactly 5 [units] in treatment and 5 in control (Gerber and Green, 2012).
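The 24.6% figure can be checked directly from the binomial probability mass function, $\binom{n}{k} p^k (1-p)^{n-k}$, with $n = 10$, $k = 5$, and $p = 0.5$. The sketch below uses only the standard library; the variable names are my own.

```python
from math import comb

n, p = 10, 0.5  # same values as in the cell above
k = 5           # units in treatment for a perfectly balanced split

# P(exactly k successes in n independent trials)
prob_balanced = comb(n, k) * p**k * (1 - p)**(n - k)
print(round(prob_balanced, 3))  # 0.246
```

There are 252 ways to choose which 5 of the 10 units are treated, out of 1,024 equally likely assignments, which gives 252/1024 ≈ 0.246.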

So that others can reproduce our assignments, we can set a random seed. Setting a seed is highly recommended, although in practice we won't use `np.random.binomial()`. (Note: I'll always use `42` as the seed.)

In [2]:
np.random.seed(42)
np.random.binomial(n, p)

4
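As a quick check of what the seed buys us, the sketch below resets the seed before each draw and confirms that the two draws match. The helper `draw_treated_count` is a hypothetical name introduced here for illustration; it simply wraps the two calls from the cell above.

```python
import numpy as np

def draw_treated_count(seed, n=10, p=0.5):
    """Hypothetical helper: reset the seed, then draw one binomial count."""
    np.random.seed(seed)
    return np.random.binomial(n, p)

first = draw_treated_count(42)
second = draw_treated_count(42)
print(first == second)  # True
```

Without the call to `np.random.seed()`, the two draws would come from different points in the random stream and would not be guaranteed to match.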