# Experimental Units

In this notebook, I'll try to implement `choose_color_assignment` without considering experimental units.

In [1]:
import random
from utils.pretty import pp
from utils.simulate import n_different_users
from utils.simulate import same_user_n_times

## Randomly assigning, without experimental units

I could try using `random.choice` to implement `choose_color_assignment`.

Here are the rules I'll consider:

* ① Given a user_id, return the string of the color to show
* ② The same user_id is assigned to the same color
* ③ Different user_ids are randomly assigned
* ④ The proportion of user_ids that see red and blue is roughly 50-50

I'm going to reorder this slightly and save ② for last.

In [2]:
def bad_choose_color_assignment(user_id):
    return random.choice(['red', 'blue'])

### ① Given a user_id, return the string of the color to show ✅

In [3]:
bad_choose_color_assignment(user_id=1)

'red'

### ③ and ④, different user_ids are randomly assigned and the proportion is roughly 50-50 ✅ 

If I look at 10000 different users, about half see red and half see blue, like I want.

In [4]:
n_different_users(bad_choose_color_assignment, n=10000).groupby('color').count()

color,user_id
blue,5088
red,4912
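
How close to 50-50 should I expect to be? As a rough check, assuming each assignment is an independent fair coin flip, the number of blue assignments among 10000 users follows a binomial distribution. The sketch below works out how far 5088 is from the expected 5000, measured in standard deviations.

```python
import math

n = 10000
blue = 5088  # count of blue assignments from the output above

expected = n * 0.5                  # 5000 under a fair 50-50 split
std_dev = math.sqrt(n * 0.5 * 0.5)  # standard deviation of a binomial(n, 0.5) count
z = (blue - expected) / std_dev     # distance from expected, in standard deviations

print(f"expected={expected:.0f}, std_dev={std_dev:.0f}, z={z:.2f}")
# prints: expected=5000, std_dev=50, z=1.76
```

A result within about two standard deviations is unremarkable for a fair coin, so this split is consistent with "roughly 50-50".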


### ② The same user_id is assigned to the same color ❌

I'm not showing the same user_id the same color.

In [5]:
pp(
    same_user_n_times(bad_choose_color_assignment, n=10)
)

,user_id,color
0,1,red
1,1,blue
2,1,red
3,1,red
4,1,blue
5,1,red
6,1,blue
7,1,red
8,1,blue
9,1,red


## What this means in practice

In this case, the experimental unit *depends on where we call `bad_choose_color_assignment`*. This can have unintended consequences. If we called `bad_choose_color_assignment` once per page request, a user could get a different assignment each time they visit a page! That can make for a bad user experience and skew metrics.


### Different Experimental Units

In these notebooks, I'll use `user_id` as my experimental unit. Other units are possible: with `user_id+day`, each user gets a new, independent assignment every day. Even `page request` could be a valid experimental unit for another experiment.

### Experimental units and metrics

Whichever unit I pick, I'd want to consider the user experience and the impact on metrics. Page request could work, but I probably don't want to use it for a UX change. I'll also need to be able to compute my metric within that page request, for example, the timing of a call. I couldn't use something like click-through rate, since the view might get one assignment and the click might get another.
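
To see why click-through rate breaks down with page request as the unit, here is a minimal simulation sketch. It assumes each user makes one view request and one click request, and that every request is assigned independently. `per_request_color` and `mismatch_rate` are hypothetical names for illustration, not part of the notebook's `utils`.

```python
import random

rng = random.Random(42)  # seeded only so the sketch is repeatable

def per_request_color():
    """Assign independently on every request, ignoring who the user is."""
    return rng.choice(['red', 'blue'])

def mismatch_rate(n_users):
    """Fraction of users whose view and click requests got different colors."""
    mismatches = 0
    for _ in range(n_users):
        color_on_view = per_request_color()
        color_on_click = per_request_color()
        if color_on_view != color_on_click:
            mismatches += 1
    return mismatches / n_users

rate = mismatch_rate(10000)
print(f"view/click mismatch rate: {rate:.2f}")
```

Under these assumptions, about half of the clicks would be attributed to a color the user never actually saw, so a per-color click-through rate would not measure what I want.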

## Storing assignments

One way to implement assignment with experimental units is to use `random.choice` and store the result.

In [6]:
color_assignments = {}  # user_id -> color

def choose_color_assignment(user_id):
    if user_id in color_assignments:
        return color_assignments[user_id]
    else:
        color = random.choice(['red', 'blue'])
        color_assignments[user_id] = color
        return color

### ② The same user_id is assigned to the same color ✅

In [7]:
pp(
    same_user_n_times(choose_color_assignment, n=10)
)

,user_id,color
0,1,red
1,1,red
2,1,red
3,1,red
4,1,red
5,1,red
6,1,red
7,1,red
8,1,red
9,1,red
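
The same storing idea could extend to other experimental units by changing what the stored key is. The sketch below is one hypothetical way to do that, not the notebook's implementation: `assign_once`, `color_for_user`, and `color_for_user_day` are names I made up for illustration, and the dates are arbitrary.

```python
import random

rng = random.Random(0)  # seeded only so the sketch is repeatable
assignments = {}  # experimental unit key -> color

def assign_once(unit_key):
    """Pick a color the first time a key is seen, then reuse it (hypothetical helper)."""
    if unit_key not in assignments:
        assignments[unit_key] = rng.choice(['red', 'blue'])
    return assignments[unit_key]

# user_id as the unit: one color per user, reused on every call.
def color_for_user(user_id):
    return assign_once(('user', user_id))

# user_id+day as the unit: an independent draw for each user on each day.
def color_for_user_day(user_id, day):
    return assign_once(('user+day', user_id, day))

# Repeated calls with the same unit return the same color.
assert color_for_user(1) == color_for_user(1)
assert color_for_user_day(1, '2024-01-01') == color_for_user_day(1, '2024-01-01')

# A different day is a different unit, so it gets its own stored entry.
color_for_user_day(1, '2024-01-02')
print(len(assignments))  # 3 stored units
```

The key is the experimental unit: whatever goes into it determines which calls are guaranteed to agree with each other.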


# Summary

In this section, I introduced experimental units by showing a `choose_color_assignment` implementation that doesn't consider the experimental unit, then fixed it by storing each user's assignment.


# [Next: Deterministic Assignment](2.DeterministicAssignment.ipynb)

Storing the assignment isn't the only way. Next, I'll implement deterministic assignment.

# TOC
- **[0. Introduction](0.Introduction.ipynb)**: What a good `choose_color_assignment` function looks like.
- **[1. Experimental Units](1.ExperimentalUnits.ipynb)**: What happens when I don't pay attention to experimental units.
- **[2. Deterministic Assignment](2.DeterministicAssignment.ipynb)**: What it looks like to assign deterministically.
- **[3. Scaling](3.Scaling.ipynb)**: How not to run two experiments at the same time.
- **[4. Rollout](4.Rollout.ipynb)**: How to gradually show users a new experiment.