# Making your own data

### Overview
- How PlanOut logs data
- Flow for loading and analyzing data
- Putting it all together: simulated web app and example analysis

In [46]:
%load_ext rpy2.ipython
from planout.ops.random import *
from planout.experiment import SimpleExperiment
import pandas as pd
import json
import random



In [59]:
%%R
library(dplyr)

# Logging

## Log files

Create a new experiment and get a randomized assignment:

In [25]:
class LoggedExperiment(SimpleExperiment):
    def assign(self, params, userid):
        params.x = UniformChoice(
            choices=["What's on your mind?", "Say something."],
            unit=userid
        )
        params.y = BernoulliTrial(p=0.5, unit=userid)

print(LoggedExperiment(userid=8).get('x'))

Say something.


Then open your terminal, navigate to the directory this notebook is in, and type:

```
> tail -f LoggedExperiment.log
```

You can now watch data being logged for your experiment as it runs.

### Exposure logging

- Parameter assignments are logged automatically the first time you retrieve a parameter
- Logger can be configured to do caching, write to databases, etc.

In [26]:
e = LoggedExperiment(userid=7)
print(e.get('x'))
print(e.get('y'))

What's on your mind?
1
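
The "log on first retrieval" behavior can be pictured with a small stand-in class. This is only a sketch of the idea, not PlanOut's code: `LogOnceExperiment`, its `logged` list, and the `_exposure_logged` flag are names made up for illustration.

```python
class LogOnceExperiment(object):
    """Hypothetical stand-in (not PlanOut code) for log-on-first-get."""

    def __init__(self, params):
        self._params = params
        self._exposure_logged = False
        self.logged = []  # stands in for the log file

    def log_exposure(self):
        # record the full set of parameter assignments
        self.logged.append({'event': 'exposure', 'params': dict(self._params)})
        self._exposure_logged = True

    def get(self, name, default=None):
        # only the first retrieval on this instance triggers a log entry
        if not self._exposure_logged:
            self.log_exposure()
        return self._params.get(name, default)


e = LogOnceExperiment({'x': 'Say something.', 'y': 1})
e.get('x')
e.get('y')
print(len(e.logged))  # prints 1
```

Note that the flag lives on the instance, so in this sketch a new instance for the same user would log again. That is one reason to de-duplicate exposures at analysis time.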


### Manual exposure logging

Calling `log_exposure()` will force PlanOut to log an exposure event. You can optionally pass in additional data.

In [27]:
e.log_exposure()
e.log_exposure({'endpoint': 'home.py'})

### Event logging

You can also log arbitrary events. The first argument to `log_event()` is a required parameter that specifies the event type.

In [28]:
e.log_event('post_status_update')
e.log_event('post_status_update', {'type': 'photo'})

## Custom logging

- Logging method is configurable
- Can write to arbitrary loggers

In [48]:
class CustomLoggedExperiment(SimpleExperiment):
    def assign(self, params, userid):
        params.x = UniformChoice(
            choices=["What's on your mind?", "Say something."],
            unit=userid
        )
        params.y = BernoulliTrial(p=0.5, unit=userid)
    def log(self, data):
        # print each log record as one line of JSON instead of writing to a file
        print(json.dumps(data))

e = CustomLoggedExperiment(userid=7)
print(e.get('x'))

{"inputs": {"userid": 7}, "name": "CustomLoggedExperiment", "params": {"y": 1, "x": "What's on your mind?"}, "time": 1431931033, "salt": "CustomLoggedExperiment", "event": "exposure"}
What's on your mind?
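
As a variation on the `print`-based logger, records could be routed through Python's standard `logging` module. The sketch below is self-contained and does not import PlanOut. `JsonLoggingMixin` and the logger name `'experiments'` are hypothetical, and the assumption is that the mixin's `log()` would take the place of the `log()` override shown above.

```python
import io
import json
import logging


class JsonLoggingMixin(object):
    """Hypothetical mixin: send each log record to a `logging` logger."""
    logger = logging.getLogger('experiments')  # assumed logger name

    def log(self, data):
        # one JSON object per line, so the file is easy to parse later
        self.logger.info(json.dumps(data, sort_keys=True))


# Demo: capture the output in memory instead of a file.
stream = io.StringIO()
JsonLoggingMixin.logger.addHandler(logging.StreamHandler(stream))
JsonLoggingMixin.logger.setLevel(logging.INFO)

JsonLoggingMixin().log({'event': 'exposure', 'inputs': {'userid': 7}})
print(stream.getvalue())
```

Swapping the `StreamHandler` for a file, syslog, or network handler changes where records go without touching the experiment class.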


# Putting it all together

We simulate the components of a PlanOut-driven website and show how data analysis would work in conjunction with the data generated from the simulation.

This hypothetical experiment looks at the effect of sorting a music album's songs by popularity (instead of, say, track number) in a web-based music store.

Our website simulation consists of four main parts:
 * Code to render the web page (which uses PlanOut to decide how to display items)
 * Code to handle item purchases (this logs the "conversion" event)
 * Code to simulate the process of users' purchase decision-making
 * A loop that simulates many users viewing many albums

In [29]:
class MusicExperiment(SimpleExperiment):
    def assign(self, params, userid, albumid):
        params.sort_by_rating = BernoulliTrial(p=0.2, unit=[userid, albumid])
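
The rest of this simulation re-creates `MusicExperiment` for the same user / album pair in several functions, which only works because the assignment is a deterministic function of its units. The sketch below illustrates that idea with a hash. It is an illustrative scheme, not necessarily PlanOut's exact hashing, and `bernoulli_by_hash` is a hypothetical helper.

```python
import hashlib


def bernoulli_by_hash(salt, units, p):
    """Deterministic pseudo-random 0/1 draw keyed on `salt` and `units`."""
    key = salt + '.' + '.'.join(str(u) for u in units)
    digest = hashlib.sha1(key.encode('utf-8')).hexdigest()
    # map the first 15 hex digits to a number in [0, 1)
    u = int(digest[:15], 16) / float(16 ** 15)
    return 1 if u < p else 0


# The same (userid, albumid) pair always gets the same answer...
first = bernoulli_by_hash('MusicExperiment.sort_by_rating', [7, 3], 0.2)
second = bernoulli_by_hash('MusicExperiment.sort_by_rating', [7, 3], 0.2)
print(first == second)

# ...while across many pairs roughly a fraction p are assigned 1.
draws = [bernoulli_by_hash('MusicExperiment.sort_by_rating', [u, a], 0.2)
         for u in range(500) for a in range(20)]
print(sum(draws) / float(len(draws)))
```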

In [30]:
def get_price(albumid):
    "look up the price of an album"
    # this would realistically hook into a database
    return 11.99

#### Rendering the web page

In [31]:
def render_webpage(userid, albumid):
    'simulated web page rendering function'
    
    # get experiment for the given user / album pair.
    e = MusicExperiment(userid=userid, albumid=albumid)
    
    # use log_exposure() so that we can also record the price
    e.log_exposure({'price': get_price(albumid)})
    
    # use a default value with get() in production settings, in case
    # your experimentation system goes down
    if e.get('sort_by_rating', False):
        songs = "some sorted songs" # this would sort the songs by rating
    else:
        songs = "some non-sorted songs"
    
    html = "some HTML code involving %s" % songs  # most valid html ever.
    # render html

#### Logging outcomes

In [32]:
def handle_purchase(userid, albumid):
    'handles purchase of an album'
    e = MusicExperiment(userid=userid, albumid=albumid)
    e.log_event('purchase', {'price': get_price(albumid)})
    # start album download

### Generative model of user decision making

In [39]:
def simulate_user_decision(userid, albumid):
    'simulate user experience'
    # This function should be thought of as simulating a user's decision-making
    # process for the given stimulus - and so we don't actually want to do any
    # logging here.
    e = MusicExperiment(userid=userid, albumid=albumid)
    e.set_auto_exposure_logging(False)  # turn off auto-logging
    
    # users with sorted songs have a higher purchase rate
    if e.get('sort_by_rating'):
        prob_purchase = 0.15
    else:
        prob_purchase = 0.10
    
    # make purchase with probability prob_purchase
    return random.random() < prob_purchase

### Running the simulation

In [40]:
# We then simulate 500 users' visitation to 20 albums, and their decision to purchase
random.seed(0)
for u in range(500):
    for a in range(20):
        render_webpage(u, a)
        if simulate_user_decision(u, a):
            handle_purchase(u, a)

## Analyzing your experiment

### Standard analysis procedure
- Data is logged as JSON, one record per line.
- Use a script to flatten the file into a tabular format.

In [None]:
# stolen from http://stackoverflow.com/questions/23019119/converting-multilevel-nested-dictionaries-to-pandas-dataframe
from collections import OrderedDict
def flatten(d):
    "Flatten an OrderedDict object"
    result = OrderedDict()
    for k, v in d.items():
        if isinstance(v, dict):
            result.update(flatten(v))
        else:
            result[k] = v
    return result

In [54]:
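def log2csv(filename):
    "convert a log file of JSON lines into a CSV file with flattened columns"
    with open(filename) as f:
        raw_log_data = [json.loads(line) for line in f]
    log_data = pd.DataFrame.from_dict([flatten(i) for i in raw_log_data])
    # swap the '.log' extension for '.csv'
    log_data.to_csv(filename[:-4] + '.csv', index=False)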

It's preferable to work with the data as a flat set of columns, so we use the handy `flatten` function above (which Eytan found on Stack Overflow) to collapse the nested dictionaries in each log record.
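
As a quick check of what flattening does to a single log line, here is a self-contained sketch. It repeats `flatten` so the block runs on its own, and the sample record is a shortened, made-up version of the exposure record printed earlier.

```python
import json
from collections import OrderedDict


def flatten(d):
    "Flatten a nested dictionary into a single level"
    result = OrderedDict()
    for k, v in d.items():
        if isinstance(v, dict):
            result.update(flatten(v))
        else:
            result[k] = v
    return result


# Shortened sample record (illustrative, not copied from a real log file).
line = '{"inputs": {"userid": 7}, "params": {"y": 1, "x": "Say something."}, "event": "exposure"}'
flat = flatten(json.loads(line))
print(dict(flat))

# Caveat: the outer keys are dropped, so nested keys with the same name collide.
clash = flatten({'inputs': {'id': 1}, 'params': {'id': 2}})
print(dict(clash))  # only the last 'id' survives
```

The nested `inputs` and `params` dictionaries become the top-level columns `userid`, `y`, and `x`. Because of the collision caveat, this approach assumes that input names and parameter names do not overlap.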

Here is what the flattened dataframe looks like:

In [55]:
log2csv('MusicExperiment.log')

In [61]:
%%R
d <- read.csv('MusicExperiment.csv')
print(d %>% sample_n(10))

      albumid    event            name price            salt sort_by_rating
7566       14 exposure MusicExperiment 11.99 MusicExperiment              1
2449       13 exposure MusicExperiment 11.99 MusicExperiment              0
4493        1 exposure MusicExperiment 11.99 MusicExperiment              0
10695       1 exposure MusicExperiment 11.99 MusicExperiment              1
311         3 exposure MusicExperiment 11.99 MusicExperiment              0
9624       19 exposure MusicExperiment 11.99 MusicExperiment              0
1585        9 exposure MusicExperiment 11.99 MusicExperiment              0
1065        4 exposure MusicExperiment 11.99 MusicExperiment              0
9247       18 exposure MusicExperiment 11.99 MusicExperiment              0
9925       10 exposure MusicExperiment 11.99 MusicExperiment              1
            time userid
7566  1431930243    340
2449  1431930242    109
4493  1431930243    202
10695 1431930244    481
311   1431930242     14
9624  1431930243    

In [64]:
%%R
d %>%
  group_by(event) %>%
  summarise(n=n())

Source: local data frame [2 x 2]

     event     n
1 exposure 10000
2 purchase  1123


### Joining exposure data with event data

We first extract all user-album pairs that were exposed to an experimental treatment, along with their parameter assignments.

In [16]:
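# log_data is local to log2csv(), so load the flattened CSV here
log_data = pd.read_csv('MusicExperiment.csv')
all_exposures = log_data[log_data.event=='exposure']
unique_exposures = all_exposures[['userid','albumid','sort_by_rating']].drop_duplicates()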

Tabulating the assignments, we find that about 20% of user-album pairs were assigned `sort_by_rating = 1`, which matches the `p=0.2` specified in `MusicExperiment`.

In [17]:
unique_exposures[['userid','sort_by_rating']].groupby('sort_by_rating').agg(len)

                userid
sort_by_rating
0                 8001
1                 1999


Now we can merge with the conversion data.

In [18]:
conversions = log_data[log_data.event=='purchase'][['userid', 'albumid','price']]
df = pd.merge(unique_exposures, conversions, on=['userid', 'albumid'], how='left')
df['purchased'] = df.price.notnull()
df['revenue'] = df.purchased * df.price.fillna(0)
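
To see what the left join does on a small case, here is a toy sketch with made-up rows (three exposures, one purchase). The column names follow the cell above. The data, and the names `exposures`, `purchases`, and `toy`, are purely illustrative.

```python
import pandas as pd

# Three exposed user-album pairs, only one of which led to a purchase.
exposures = pd.DataFrame({
    'userid': [0, 0, 1],
    'albumid': [0, 1, 0],
    'sort_by_rating': [0, 1, 0],
})
purchases = pd.DataFrame({'userid': [0], 'albumid': [1], 'price': [11.99]})

# how='left' keeps every exposure; pairs without a purchase get a missing price.
toy = pd.merge(exposures, purchases, on=['userid', 'albumid'], how='left')
toy['purchased'] = toy.price.notnull()
toy['revenue'] = toy.price.fillna(0)
print(toy)
```

A left join is used so that non-purchasers stay in the denominator. An inner join would keep only buyers and make every purchase rate look like 100%.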

Here is a sample of the merged rows. Most rows contain missing values for price, because the user didn't purchase the item.

In [19]:
df[:5]

   userid  albumid  sort_by_rating  price  purchased  revenue
0       0        0               0    NaN      False        0
1       0        1               0    NaN      False        0
2       0        2               0    NaN      False        0
3       0        3               0    NaN      False        0
4       0        4               1    NaN      False        0


Restricted to those who bought something...

In [20]:
df[df.price > 0][:5]

    userid  albumid  sort_by_rating  price  purchased  revenue
35       1       15               0  11.99       True    11.99
40       2        0               0  11.99       True    11.99
52       2       12               0  11.99       True    11.99
56       2       16               1  11.99       True    11.99
75       3       15               0  11.99       True    11.99


### Analyzing the experimental results

In [21]:
df.groupby('sort_by_rating')[['purchased', 'price', 'revenue']].agg('mean')

                purchased  price   revenue
sort_by_rating
0                0.103362  11.99  1.239311
1                0.145573  11.99  1.745418


If you were actually analyzing the experiment, you would also want to compute confidence intervals.
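
As a rough sketch of what that could look like, the block below computes a normal-approximation 95% interval for the difference in purchase rates, using the rates and group sizes from the tables above. It treats every user-album pair as an independent observation, which is an assumption. With real data, where each user appears in many rows, that would likely understate the uncertainty. The helper name `diff_ci` is hypothetical.

```python
import math


def diff_ci(p1, n1, p0, n0, z=1.96):
    """Normal-approximation CI for the difference of two proportions."""
    # standard error of p1 - p0, assuming independent observations
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    d = p1 - p0
    return d - z * se, d + z * se


# Purchase rates and group sizes taken from the tables above.
control_rate, control_n = 0.103362, 8001
treated_rate, treated_n = 0.145573, 1999

lo, hi = diff_ci(treated_rate, treated_n, control_rate, control_n)
print("difference in purchase rate: %.4f (95%% CI %.4f to %.4f)"
      % (treated_rate - control_rate, lo, hi))
```

If the interval excludes zero, the observed lift is unlikely to be due to chance alone under these assumptions.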