# Experiment Assignment on the Web: Introduction

This is supplemental material for my PyCon talk. It includes code samples, and it covers a few things I really wanted to talk about but didn't have time for.

In these notebooks, I'll implement `choose_color_assignment`, a function that does experiment assignment. In this notebook, I'll introduce the goals of experiment assignment, and show what a good function looks like.

## Table of Contents
- **[0. Introduction](0.Introduction.ipynb)**: What a good `choose_color_assignment` function looks like.
- **[1. Experimental Units](1.ExperimentalUnits.ipynb)**: What happens when I don't pay attention to experimental units.
- **[2. Deterministic Assignment](2.DeterministicAssignment.ipynb)**: What deterministic assignment looks like.
- **[3. Scaling](3.Scaling.ipynb)**: How not to run two experiments at the same time (salts)
- **[4. Rollout](4.Rollout.ipynb)**: How to gradually show users a new experiment (assignment groups)

In [1]:
# The utils folder has helper code
from utils import spoilers
from utils.pretty import pp
from utils.simulate import same_user_n_times
from utils.simulate import n_different_users

## Goal of Experiment Assignment

The main purpose of experiment assignment is to randomly assign each experimental unit to a variant.

The function `choose_color_assignment` will assign the experimental units (here, `user_id`s) so that roughly half of the users see 'red' and the other half see 'blue'.

<img width=33% src="files/abtest.png">

## Spoilers

This is what I think a good `choose_color_assignment` should do:

* ① Given a user_id, return the string of the color to show
* ② The same user_id is assigned to the same color
* ③ Different user_ids are randomly assigned
* ④ The proportion of user_ids that see red and blue is roughly 50-50
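
As a rough sketch of how one function could satisfy all four properties, here is a hash-based version. This is an assumption for illustration, not necessarily how `spoilers.choose_color_assignment` works; the name `choose_color_assignment_sketch` and the choice of SHA-256 are hypothetical.

```python
import hashlib


def choose_color_assignment_sketch(user_id):
    """Hypothetical stand-in for spoilers.choose_color_assignment."""
    # Hashing is deterministic, so the same user_id always gets the same digest.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    # Treat the digest as a big integer; its parity splits users about 50-50
    # in a way that looks random across different user_ids.
    return "red" if int(digest, 16) % 2 == 0 else "blue"


# Calling it twice with the same user_id gives the same color both times.
first = choose_color_assignment_sketch(user_id=1)
second = choose_color_assignment_sketch(user_id=1)
print(first == second)  # True
```

Nothing is stored per user here: the color is recomputed from the user_id on every call, which is what keeps repeat visits consistent.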

### ① Given a user_id, return the string of the color to show

In [2]:
# This says we should show the 'red' variant to user_id 1.
spoilers.choose_color_assignment(user_id=1)

'red'

### ② The same user_id is assigned to the same color

In [3]:
# If I try to assign this user again, they should see red again.
spoilers.choose_color_assignment(user_id=1)

'red'

In [4]:
# If the user visits 10 times, each time they should see red.
pp(
    same_user_n_times(spoilers.choose_color_assignment, n=10)
)

,user_id,color
0,1,red
1,1,red
2,1,red
3,1,red
4,1,red
5,1,red
6,1,red
7,1,red
8,1,red
9,1,red


In [5]:
# Even the 10000th time, they'll be assigned to the same color!
same_user_n_times(spoilers.choose_color_assignment, n=10000).groupby('color').count()

color,user_id
red,10000


### ③ Different user_ids are randomly assigned

Different user_ids are assigned to variants in a [statistically random](https://en.wikipedia.org/wiki/Statistical_randomness) way. (I won't prove that here. I'm just going to hint at it by showing that the assignments look kind of random.)

In [6]:
pp(
    n_different_users(spoilers.choose_color_assignment, n=10)
)

,user_id,color
0,0,red
1,1,red
2,2,blue
3,3,red
4,4,blue
5,5,blue
6,6,blue
7,7,red
8,8,blue
9,9,blue
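
One way to put a number on "looks kind of random" is to count runs (streaks of the same color) and compare that to what a random ordering would give. The sketch below uses the ten colors from the output above and the expected-runs formula from the Wald-Wolfowitz runs test. Ten users is far too small a sample to conclude anything, so treat this as an illustration only.

```python
from itertools import groupby

# Colors copied from the output above, in user_id order 0..9.
colors = ["red", "red", "blue", "red", "blue",
          "blue", "blue", "red", "blue", "blue"]

n_red = colors.count("red")
n_blue = colors.count("blue")

# A run is a maximal streak of the same color; groupby yields one group per run.
observed_runs = sum(1 for _ in groupby(colors))

# Expected number of runs if the colors were in random order.
expected_runs = 2 * n_red * n_blue / (n_red + n_blue) + 1

print(observed_runs, expected_runs)  # 6 5.8
```

Far fewer runs than expected would suggest clumping, and far more would suggest alternation. Here the two numbers are close.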


### ④ The proportion of user_ids that see red and blue is roughly 50-50

In [7]:
n_different_users(spoilers.choose_color_assignment, n=10000).groupby('color').count()

color,user_id
blue,4954
red,5046
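
To see what "roughly 50-50" means for these counts, I can check how far 5046 is from the 5000 I'd expect under a fair split. This is a minimal sketch using the normal approximation to the binomial; the 0.05 threshold mentioned afterwards is a common convention, not something from the talk.

```python
import math

# Counts copied from the output above.
n_blue, n_red = 4954, 5046
n = n_blue + n_red

# Under a fair 50-50 assignment, the red count is Binomial(n, 0.5).
expected_red = n * 0.5
std_red = math.sqrt(n * 0.5 * 0.5)

# Number of standard deviations between the observed and expected counts.
z = (n_red - expected_red) / std_red

# Two-sided p-value under the normal approximation.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 2))  # 0.92
```

The observed count is less than one standard deviation from 5000, and the p-value is well above 0.05, so these counts are consistent with a fair split.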



# [Next: 1. Experimental Units](1.ExperimentalUnits.ipynb)

Next, I'll try to implement experiment assignment.

# Appendix

### Assumptions

 - I have enough users!

 - I can switch between variants on the fly. 

 
### Glossary

Some words I throw around:

 - **Variant**: This is a change controlled by an experiment. In this example, "red" is a variant of the color experiment.

 - **Experimental unit**: This is what I want to assign to a variant. I'll use some identifier, like *user_id*.
 
### Read more

 - *Designing and Deploying Online Field Experiments* (Bakshy et al.). The accompanying PlanOut framework is open source: http://facebook.github.io/planout/
 - *Controlled experiments on the web: survey and practical guide* (Kohavi et al.)
 - *Overlapping Experiment Infrastructure: More, Better, Faster Experimentation* (Tang et al.)
 - *From Infrastructure to Culture: A/B Testing Challenges in Large Scale Social Networks* (Xu et al.)