# Computing metrics with event-level experiment data

In this Lab, we'll walk through an end-to-end workflow for computing a series of metrics with data collected by both Optimizely and a third party during an Optimizely Full Stack experiment.

## The experiment

We'll use simulated data from the following "experiment" in this notebook: 

Attic & Button, a popular imaginary retailer of camera equipment and general electronics, has seen increased shipping times for some of its orders due to logistical difficulties imposed by the COVID-19 pandemic. As a result, customer support call volumes have increased.  In order to inform potential customers and cut down on customer support costs, the company's leadership has decided to add an informative banner to the [atticandbutton.com](http://atticandbutton.com) homepage.

In order to measure the impact this banner has on customer support volumes and decide which banner message is most effective, the team at Attic & Button have decided to run an experiment with the following variations:

<table>
    <tr>
        <td>
            <img src="img/control.png" alt="Control" style="width:100%; padding-left:0px">
        </td>
        <td>
            <img src="img/message_1.png" alt="Message #1" style="width:100%; padding-right:0px">
        </td>
        <td>
            <img src="img/message_2.png" alt="Message #2" style="width:100%; padding-right:0px">
        </td>
    </tr>
    <tr>
        <td style="background-color:white; text-align:center">
            "control"
        </td>
        <td style="background-color:white; text-align:center">
            "message_1"
        </td>
        <td style="background-color:white; text-align:center">
            "message_2"
        </td>
    </tr>
</table>

## The challenge

Attic & Button's call centers are managed by a third party.  This third party shares call data with Attic & Button periodically in a [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) file, making it difficult to track customer support metrics on Optimizely's [Experiment Results Page](https://app.optimizely.com/l/QQbfVyRFQYGq-J57P-3XoQ?previousView=VARIATIONS&variation=email_button&utm_campaign=copy).

In this notebook, we'll use Optimizely Enriched Event Data and our third-party call data to compute a variety of metrics for our experiment, including "Support calls per visitor" and "Total call duration per visitor". 

## Global parameters

The following global parameters are used to control the execution in this notebook.  These parameters may be overridden by setting environment variables prior to launching the notebook, for example:

```sh
export OPTIMIZELY_DATA_DIR=~/my_analysis_dir
```

In this block we check whether these parameters have been passed with environment variables and assign default values otherwise.  The default value for `OPTIMIZELY_API_TOKEN` is a read-only token for a demonstration Optimizely project.

In [None]:
import os
from getpass import getpass
from IPython.display import clear_output

# This notebook requires an Optimizely API token.
# The default token provided here is a read-only token associated with a demo Optimizely account
OPTIMIZELY_API_TOKEN = os.environ.get("OPTIMIZELY_API_TOKEN", "2:d6K8bPrDoTr_x4hiFCNVidcZk0YEPwcIHZk-IZb5sM3Q7RxRDafI")

# Uncomment the following block to enable manual API token entry
# if OPTIMIZELY_API_TOKEN is None:
#    OPTIMIZELY_API_TOKEN = getpass("Enter your Optimizely personal API access token:")

# Default path for reading and writing analysis data
OPTIMIZELY_DATA_DIR = os.environ.get("OPTIMIZELY_DATA_DIR", "./covid_test_data")

# Set environment variables
# These variables are used by other notebooks and shell scripts invoked
# in this notebook
%env OPTIMIZELY_DATA_DIR={OPTIMIZELY_DATA_DIR}
%env OPTIMIZELY_API_TOKEN={OPTIMIZELY_API_TOKEN}

clear_output()

## Download Optimizely Enriched Event data

This notebook relies (in part) on data downloaded from Optimizely's [Enriched Event Export Service](https://docs.developers.optimizely.com/optimizely-data/docs/enriched-events-export).

The default input data for this notebook can be found in the `covid_test_data` directory.

If you have the [oevents](https://github.com/optimizely/oevents) command line tool installed and accessible in your `PATH` environment variable, you may uncomment the following commands to re-download this data. Note that this will require `OPTIMIZELY_API_TOKEN` to be set to the default value specified above.

We'll start by downloading [decision](https://docs.developers.optimizely.com/optimizely-data/docs/enriched-events-data-specification#decisions-2) data collected during our experiment.  Each **decision** captures the moment a visitor was added to our experiment.

In [None]:
# Uncomment this line to re-download decision data for this experiment 
# Note: requires oevents to be installed and accessible on your path

# !oevents load --type decisions --experiment 18786493712 --date 2020-09-14

Next we'll download [conversion](https://docs.developers.optimizely.com/optimizely-data/docs/enriched-events-data-specification#conversions-2) data collected during our experiment.  Each **conversion event** captures the moment a visitor took some action on our website, e.g. viewing our homepage, adding an item to their shopping cart, or making a purchase.

In [None]:
# Uncomment this line to re-download conversion data for this experiment 
# Note: requires oevents to be installed and accessible on your path

# !oevents load --type events --date 2020-09-14

## Load Decision and Conversion Data into Spark Dataframes

We'll use [PySpark](https://spark.apache.org/docs/latest/api/python/index.html) to transform data in this notebook. We'll start by creating a new local Spark session.

In [None]:
from pyspark.sql import SparkSession

num_cores = 1
driver_ip = "127.0.0.1"
driver_memory_gb = 1
executor_memory_gb = 2

# Create a local Spark session
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL") \
    .master(f"local[{num_cores}]") \
    .config("spark.sql.repl.eagerEval.enabled", True) \
    .config("spark.sql.repl.eagerEval.truncate", 120) \
    .config("spark.driver.bindAddress", driver_ip) \
    .config("spark.driver.host", driver_ip) \
    .config("spark.driver.memory", f"{driver_memory_gb}g") \
    .config("spark.executor.memory", f"{executor_memory_gb}g") \
    .getOrCreate()

Next we'll load our decision data into a Spark dataframe:

In [None]:
import os
from lib import util

decisions_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=decisions")

# load enriched decision data from disk into a new Spark dataframe
decisions = util.read_parquet_data_from_disk(
    spark_session=spark,
    data_path=decisions_dir,
    view_name="decisions"
)

Now we can write SQL-style queries against our `decisions` view.  Let's use a simple query to examine our data:

In [None]:
spark.sql("""
    SELECT
        *
    FROM
        decisions
    LIMIT 3
""")

Next we'll load conversion data:

In [None]:
# oevents downloads conversion data into the type=events subdirectory
conversions_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=events")

# load conversion data from disk into a new Spark dataframe
conversions = util.read_parquet_data_from_disk(
    spark_session=spark,
    data_path=conversions_dir,
    view_name="events"
)

Let's take a look at our data:

In [None]:
spark.sql("""
    SELECT
        *
    FROM
        events
    LIMIT 3
""")

## Compute some useful intermediate experiment datasets

In this section, we'll compute three useful intermediate experiment datasets:

1. Enriched decisions - Optimizely [decision](https://docs.developers.optimizely.com/optimizely-data/docs/enriched-events-data-specification#decisions-2) data enriched with human-readable experiment and variation names.
2. Experiment Units - the individual units (usually website visitors or app users) that are exposed to a control or treatment in the course of a digital experiment.
3. Experiment Events - conversion events, such as a button click or a purchase, that were influenced by a digital experiment.

The following diagram illustrates how these datasets are used to compute _metric observations_ for our experiment:

![Transformations](img/transformations.png)

### Enriched decisions

First we'll use Optimizely's [Experiment API](https://library.optimizely.com/docs/api/app/v2/index.html#operation/get_experiment) to enrich our decision data with experiment and variation names.  This step makes it easier to build human-readable experiment reports with this data.

The code for enriching decision data can be found in the `enriching_decision_data.ipynb` notebook in this lab directory.

In [None]:
%run ./enriching_decision_data.ipynb

### Experiment Units

**Experiment units** are the individual units that are exposed to a control or treatment in the course of an online experiment.  In most online experiments, subjects are website visitors or app users. However, depending on your experiment design, treatments may also be applied to individual user sessions, service requests, search queries, etc. 

<table>
    <tr>
        <td>
            <img src="img/transformations_1.png" alt="Experiment Units" style="width:100%; padding-left:0px">
        </td>
        <td>
            <img src="img/tables_1.png" alt="Experiment Units" style="width:100%; padding-left:0px">
        </td>
    </tr>
</table>

In [None]:
experiment_units = spark.sql(f"""
    SELECT
        *
    FROM (
        SELECT
            *,
            RANK() OVER (PARTITION BY experiment_id, visitor_id ORDER BY timestamp ASC) AS rnk
        FROM
            enriched_decisions
    )
    WHERE
        rnk = 1
    ORDER BY timestamp ASC
""").drop("rnk")
experiment_units.createOrReplaceTempView("experiment_units")
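
For readers less familiar with window functions, the "first decision per visitor" logic can be sketched in plain Python. This is only an illustration: the sample records and the `first_decisions` helper are hypothetical, and the field names simply mirror the columns used in the query above.

```python
from datetime import datetime

# Hypothetical decision records (not real experiment data)
decisions_sample = [
    {"experiment_id": "exp_1", "visitor_id": "v1", "variation_name": "control",
     "timestamp": datetime(2020, 9, 14, 10, 0)},
    {"experiment_id": "exp_1", "visitor_id": "v1", "variation_name": "control",
     "timestamp": datetime(2020, 9, 14, 12, 0)},
    {"experiment_id": "exp_1", "visitor_id": "v2", "variation_name": "message_1",
     "timestamp": datetime(2020, 9, 14, 11, 0)},
]

def first_decisions(decisions):
    """Keep the earliest decision for each (experiment_id, visitor_id) pair."""
    earliest = {}
    for d in decisions:
        key = (d["experiment_id"], d["visitor_id"])
        if key not in earliest or d["timestamp"] < earliest[key]["timestamp"]:
            earliest[key] = d
    # Sort by timestamp to mirror the ORDER BY clause in the SQL query
    return sorted(earliest.values(), key=lambda d: d["timestamp"])

units_sample = first_decisions(decisions_sample)
for unit in units_sample:
    print(unit["visitor_id"], unit["variation_name"], unit["timestamp"])
```

One difference worth noting: `RANK()` assigns the same rank to rows with identical timestamps, so the SQL query can return more than one row per visitor when timestamps tie, whereas this sketch always keeps exactly one.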

Let's examine our experiment unit dataset:

In [None]:
spark.sql("""
    SELECT
        visitor_id,
        experiment_name,
        variation_name,
        timestamp
    FROM
        experiment_units
    LIMIT 3
""")

Let's count the number of visitors in each experiment variation:

In [None]:
spark.sql("""
    SELECT 
        experiment_name,
        variation_name,
        count(*) as unit_count
    FROM 
        experiment_units
    GROUP BY 
        experiment_name,
        variation_name
    ORDER BY
        experiment_name ASC,
        variation_name ASC
""")

### Experiment Events

An **experiment event** is an event, such as a button click or a purchase, that was influenced by an experiment.  We compute this view by isolating the conversion events triggered during a finite window of time (called the _attribution window_) after a visitor has been exposed to an experiment treatment.

<table>
    <tr>
        <td>
            <img src="img/transformations_2.png" alt="Experiment Events" style="width:100%; padding-left:0px">
        </td>
        <td>
            <img src="img/tables_2.png" alt="Experiment Events" style="width:100%; padding-left:0px">
        </td>
    </tr>
</table>

In [None]:
# Create the experiment_events view
experiment_events = spark.sql(f"""
    SELECT
        u.experiment_id,
        u.experiment_name,
        u.variation_id,
        u.variation_name,
        e.*
    FROM
        experiment_units u INNER JOIN events e ON u.visitor_id = e.visitor_id
    WHERE
        e.timestamp BETWEEN u.timestamp AND (u.timestamp + INTERVAL 48 HOURS)
""")
experiment_events.createOrReplaceTempView("experiment_events")
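
The attribution logic in the query above can also be illustrated without Spark. The following is a minimal sketch using hypothetical sample records and a hypothetical `attribute_events` helper; it assumes the same 48-hour attribution window and the same inclusive bounds as SQL's `BETWEEN`.

```python
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(hours=48)

# Hypothetical experiment units and conversion events (not real data)
units_sample = [
    {"visitor_id": "v1", "experiment_id": "exp_1", "variation_name": "message_1",
     "timestamp": datetime(2020, 9, 14, 10, 0)},
]
events_sample = [
    {"visitor_id": "v1", "event_name": "homepage_view", "timestamp": datetime(2020, 9, 14, 9, 0)},
    {"visitor_id": "v1", "event_name": "purchase", "timestamp": datetime(2020, 9, 15, 10, 0)},
    {"visitor_id": "v1", "event_name": "purchase", "timestamp": datetime(2020, 9, 17, 10, 0)},
    {"visitor_id": "v3", "event_name": "purchase", "timestamp": datetime(2020, 9, 14, 11, 0)},
]

def attribute_events(units, events, window=ATTRIBUTION_WINDOW):
    """Inner-join events to units on visitor_id, keeping only in-window events."""
    units_by_visitor = {}
    for u in units:
        units_by_visitor.setdefault(u["visitor_id"], []).append(u)

    attributed = []
    for e in events:
        for u in units_by_visitor.get(e["visitor_id"], []):
            # Inclusive on both ends, like SQL BETWEEN
            if u["timestamp"] <= e["timestamp"] <= u["timestamp"] + window:
                attributed.append({
                    **e,
                    "experiment_id": u["experiment_id"],
                    "variation_name": u["variation_name"],
                })
    return attributed

attributed_sample = attribute_events(units_sample, events_sample)
```

In this sample only the purchase made 24 hours after the decision is attributed: the earlier event precedes the decision, the later purchase falls outside the window, and visitor `v3` was never bucketed into the experiment.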

Let's examine our Experiment Events dataset:

In [None]:
spark.sql("""
    SELECT
        timestamp,
        visitor_id,
        experiment_name,
        variation_name,
        event_name,
        tags,
        revenue
    FROM
        experiment_events
    LIMIT 10
""")

As above, let's count the number of events that were influenced by each variation:

In [None]:
spark.sql(f"""
    SELECT
        experiment_name,
        variation_name,
        event_name,
        count(*) as event_count
    FROM
        experiment_events
    GROUP BY
        experiment_name,
        variation_name,
        event_name
    ORDER BY
        experiment_name ASC,
        variation_name ASC,
        event_name ASC
""")

## Compute metric observations

**Metric observations** map each **experiment unit** to a specific numerical outcome observed during an experiment.  For example, to measure the purchase conversion rate associated with each variation in an experiment, we can map each visitor to a 0 or 1, depending on whether they made at least one purchase during the attribution window in our experiment.

Unlike **experiment units** and **experiment events**, which can be computed using simple transformations,  **metric observations** are metric-dependent and can be arbitrarily complex, depending on the outcome you're trying to measure.

<table>
    <tr>
        <td>
            <img src="img/transformations_3.png" alt="Metric Observations" style="width:100%; padding-left:0px">
        </td>
        <td>
            <img src="img/tables_3.png" alt="Metric Observations" style="width:100%; padding-left:0px">
        </td>
    </tr>
</table>

We're going to use a helper function, `compute_metric_observations`, to abstract away some of the redundant parts of this computation.  This function takes a set of "raw observations" as input, and

1. performs a `LEFT JOIN` with our experiment units in order to ensure that the resulting dataset contains an observation for every visitor in our experiment
2. (optionally) appends the resulting metric observations to a global `observations` dataset

This allows us to focus on the logic for aggregating experiment events into a numerical observation, which is the most interesting part of the process.
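
To make the left-join step concrete, here is a rough plain-Python sketch of what such a helper might do. It is not the implementation in `lib/util.py`; the function name `compute_metric_observations_sketch` is hypothetical, and filling missing observations with `0` is an assumption based on the description above.

```python
def compute_metric_observations_sketch(metric_name, raw_observations, units, default=0):
    """Left-join raw observations onto experiment units.

    Every unit gets exactly one observation; units without a raw
    observation receive `default` (assumed here to be 0).
    """
    raw_by_visitor = {r["visitor_id"]: r["observation"] for r in raw_observations}
    return [
        {
            **u,
            "metric_name": metric_name,
            "observation": raw_by_visitor.get(u["visitor_id"], default),
        }
        for u in units
    ]

# Hypothetical inputs: three units, only one of which converted
units_sample = [
    {"visitor_id": "v1", "variation_name": "control"},
    {"visitor_id": "v2", "variation_name": "message_1"},
    {"visitor_id": "v3", "variation_name": "message_2"},
]
raw_sample = [{"visitor_id": "v1", "observation": 1}]

obs_sample = compute_metric_observations_sketch(
    "Purchase conversion rate", raw_sample, units_sample
)
```

Keeping the zero-valued rows matters because metric values are averaged over all units in a variation, not only over the units that triggered an event.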

### Metric: Purchase conversion rate

In this query we measure for each visitor whether they made _at least one_ purchase. The resulting observation should be `1` if the visitor triggered the event in question during the _attribution window_ and `0` otherwise.  

Since _any_ visitor who triggered an appropriate experiment event should be counted, we can simply select a `1`. 

In [None]:
## Unique conversions on the "purchase" event.
raw_purchase_conversion_rate_obs = spark.sql(f"""
    SELECT
        visitor_id,
        1 as observation
    FROM
        experiment_events
    WHERE
        event_name = 'purchase'
    GROUP BY
        visitor_id
""")
raw_purchase_conversion_rate_obs.toPandas().head(5)

We'll use the `util.compute_metric_observations` function to perform a left outer join between `experiment_units` and our newly-computed `purchase` conversion data.

In [None]:
observations = util.compute_metric_observations(
    "Purchase conversion rate",
    raw_purchase_conversion_rate_obs,
    experiment_units,
)

Let's take a look at our observations view:

In [None]:
observations.createOrReplaceTempView("observations")
spark.sql("""
    SELECT 
        metric_name,
        timestamp,
        visitor_id, 
        experiment_name, 
        variation_name, 
        observation 
    FROM 
        observations
    ORDER BY
        timestamp ASC
    LIMIT 10
""")

Metric observations can be used to compute a variety of useful statistics.  Let's compute the value of our `purchase` conversion rate metric for all of the visitors in our experiment:

In [None]:
spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
    WHERE
        metric_name = "Purchase conversion rate"
    GROUP BY
        metric_name,
        experiment_name
""")

Now let's compute the `purchase` conversion rate broken down by experiment variation:

In [None]:
spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        variation_name,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
    WHERE
        metric_name = "Purchase conversion rate"
    GROUP BY
        metric_name,
        experiment_name,
        variation_name
""")

### Metric: Product detail page views per visitor

In this query we count the number of product detail page views per visitor.

In [None]:
## Total conversions on the "detail_page_view" event.
observations = util.compute_metric_observations(
    "Product detail page views per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            count(1) as observation
        FROM
            experiment_events
        WHERE
            event_name = "detail_page_view"
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

We can inspect our observations by sampling a few of the rows we've computed for this metric:

In [None]:
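# Re-register the view in case the helper did not refresh it with the appended rows
observations.createOrReplaceTempView("observations")
spark.sql("""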
    SELECT 
        metric_name, 
        timestamp,
        experiment_name, 
        variation_id, 
        visitor_id, 
        observation 
    FROM 
        observations
    WHERE
        metric_name = "Product detail page views per visitor"
    LIMIT 5
""")

### Metric: Revenue from electronics purchases

In this query we compute the total revenue associated with electronics purchases made by our experiment subjects.

In [None]:
observations = util.compute_metric_observations(
    "Electronics revenue per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            sum(revenue) as observation
        FROM 
            experiment_events
            LATERAL VIEW explode(tags) t
        WHERE
            t.key = "category" AND 
            t.value = "electronics" AND
            event_name = "purchase"
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)
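
The tag filter in the query above can be illustrated in plain Python. This sketch assumes each event carries a `tags` mapping of string keys to string values; the sample events, the revenue values, and the `electronics_revenue` helper are hypothetical.

```python
def electronics_revenue(events):
    """Sum revenue per visitor for purchases tagged category=electronics."""
    revenue_by_visitor = {}
    for e in events:
        tags = e.get("tags") or {}
        if e["event_name"] == "purchase" and tags.get("category") == "electronics":
            visitor = e["visitor_id"]
            revenue_by_visitor[visitor] = revenue_by_visitor.get(visitor, 0) + e["revenue"]
    return revenue_by_visitor

# Hypothetical experiment events (not real data)
events_sample = [
    {"visitor_id": "v1", "event_name": "purchase", "revenue": 5000,
     "tags": {"category": "electronics"}},
    {"visitor_id": "v1", "event_name": "purchase", "revenue": 2500,
     "tags": {"category": "electronics"}},
    {"visitor_id": "v1", "event_name": "purchase", "revenue": 900,
     "tags": {"category": "accessories"}},
    {"visitor_id": "v2", "event_name": "add_to_cart", "revenue": 0,
     "tags": {"category": "electronics"}},
]

revenue_sample = electronics_revenue(events_sample)
```

Visitors with no matching purchase are absent from the result, which is why the left join against `experiment_units` is still needed to give them an observation.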

Again, let's examine our observations:

In [None]:
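# Re-register the view in case the helper did not refresh it with the appended rows
observations.createOrReplaceTempView("observations")
spark.sql("""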
    SELECT 
        metric_name, 
        timestamp,
        experiment_name, 
        variation_id, 
        visitor_id, 
        observation 
    FROM 
        observations
    WHERE
        metric_name = "Electronics revenue per visitor" AND
        observation > 0
    LIMIT 5
""")

### Metric: Call center volume

We can use the same techniques to compute experiment metrics using "external" data not collected by Optimizely.  We'll demonstrate by loading a CSV file of customer support call records.

We'll start by reading in our call center data:

In [None]:
import pandas as pd

# Read call center logs CSV into a pandas dataframe
df = pd.read_csv("covid_test_data/call_data.csv")

# Display a sample of our call record data
df.head(5)

Now let's make sure our call center data schema is compatible with the transformations we want to perform.

In [None]:
# Convert "call start" timestamp strings to datetime objects
df["timestamp"] = pd.to_datetime(df.call_start)

# Rename the "user_id" column to "visitor_id" to match our decision schema
df = df.rename(columns={"user_id" : "visitor_id"})

# Convert pandas to spark dataframe
call_records = spark.createDataFrame(df)

# Create a temporary view so that we can query using SQL
call_records.createOrReplaceTempView("call_records")

# Display a sample of our call record data
spark.sql("SELECT * FROM call_records LIMIT 5")
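
The same schema normalization can be sketched with only the standard library, which may be handy for spot-checking a third-party file before loading it into Spark. The column names `user_id`, `call_start`, and `call_duration_min` come from the cells in this notebook; the sample rows, the timestamp format, and the `load_call_records` helper are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical CSV content; the real file's timestamp format may differ
raw_csv = """user_id,call_start,call_duration_min
v1,2020-09-14 10:30:00,12.5
v2,2020-09-15 08:00:00,3.0
"""

def load_call_records(csv_file, timestamp_format="%Y-%m-%d %H:%M:%S"):
    """Parse call records and rename columns to match the decision schema."""
    records = []
    for row in csv.DictReader(csv_file):
        records.append({
            # "user_id" becomes "visitor_id" so it can be joined to experiment units
            "visitor_id": row["user_id"],
            # "call_start" becomes a datetime "timestamp" for the attribution window
            "timestamp": datetime.strptime(row["call_start"], timestamp_format),
            "call_duration_min": float(row["call_duration_min"]),
        })
    return records

call_records_sample = load_call_records(io.StringIO(raw_csv))
```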

Now let's transform our call center logs into "experiment calls" using the attribution logic we used above to compute "experiment events":

In [None]:
# Create the experiment_calls view
experiment_calls = spark.sql(f"""
    SELECT
        u.experiment_id,
        u.experiment_name,
        u.variation_id,
        u.variation_name,
        e.*
    FROM
        experiment_units u INNER JOIN call_records e ON u.visitor_id = e.visitor_id
    WHERE
        e.timestamp BETWEEN u.timestamp AND (u.timestamp + INTERVAL 48 HOURS)
""")
experiment_calls.createOrReplaceTempView("experiment_calls")

Now we can compute metric observations for call center calls and duration!

In [None]:
# Count the number of support phone calls per visitor
observations = util.compute_metric_observations(
    "Customer support calls per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            count(1) as observation
        FROM 
            experiment_calls
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

# Sum the total duration (in minutes) of support phone calls per visitor
observations = util.compute_metric_observations(
    "Total customer support minutes per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            sum(call_duration_min) as observation
        FROM 
            experiment_calls
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

## Computing metric values for experiment cohorts

We can slice and dice our metric observation data to compute metric values for different experiment cohorts.  Here are some examples:

### Computing metric values per variation

Let's start by computing metric values broken down by experiment variation.

In [None]:
# Compute metric values broken down by experiment variation
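# Re-register the view in case the helper did not refresh it with the appended rows
observations.createOrReplaceTempView("observations")
spark.sql("""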
    SELECT
        metric_name,
        experiment_name,
        variation_name,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
    GROUP BY
        metric_name,
        experiment_name,
        variation_name
    ORDER BY
        metric_name,
        experiment_name,
        variation_name
""")

It looks like the average number of customer support calls per visitor and the average call duration are both much lower in the cohorts that saw one of our informational banners!
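
The per-variation aggregation is simply a mean of observations over the units in each variation. A minimal plain-Python sketch, using hypothetical observation rows and a hypothetical `metric_values_by_variation` helper, looks like this:

```python
from collections import defaultdict

def metric_values_by_variation(observations):
    """Compute unit count, observation sum, and mean per (metric, variation)."""
    totals = defaultdict(lambda: [0, 0.0])  # [unit_count, observation_sum]
    for o in observations:
        key = (o["metric_name"], o["variation_name"])
        totals[key][0] += 1
        totals[key][1] += o["observation"]
    return {
        key: {"unit_count": count, "sum": total, "metric_value": total / count}
        for key, (count, total) in totals.items()
    }

# Hypothetical observations (not real experiment results)
metric = "Customer support calls per visitor"
observations_sample = [
    {"metric_name": metric, "variation_name": "control", "observation": 1},
    {"metric_name": metric, "variation_name": "control", "observation": 0},
    {"metric_name": metric, "variation_name": "control", "observation": 2},
    {"metric_name": metric, "variation_name": "control", "observation": 1},
    {"metric_name": metric, "variation_name": "message_1", "observation": 0},
    {"metric_name": metric, "variation_name": "message_1", "observation": 1},
    {"metric_name": metric, "variation_name": "message_1", "observation": 0},
    {"metric_name": metric, "variation_name": "message_1", "observation": 0},
]

values_sample = metric_values_by_variation(observations_sample)
```

Because every unit contributes one row, including those with an observation of zero, the denominator is the number of visitors in the variation rather than the number of visitors who called.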

### Computing metric values for a visitor segment

We can filter metric observations by visitor attributes in order to compute metric values for a particular segment.
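
As a rough plain-Python illustration of this filter-and-aggregate step, the sketch below assumes each observation carries an `attributes` list of name/value pairs. The sample rows and the helpers `attribute_value` and `metric_values_by_segment` are hypothetical.

```python
from collections import defaultdict

def attribute_value(observation, name):
    """Return the value of the named visitor attribute, or None if absent."""
    for attr in observation.get("attributes", []):
        if attr["name"] == name:
            return attr["value"]
    return None

def metric_values_by_segment(observations, attribute_name):
    """Average observations per (variation, attribute value)."""
    totals = defaultdict(lambda: [0, 0.0])  # [unit_count, observation_sum]
    for o in observations:
        segment = attribute_value(o, attribute_name)
        if segment is None:
            continue  # units without the attribute are excluded from every segment
        key = (o["variation_name"], segment)
        totals[key][0] += 1
        totals[key][1] += o["observation"]
    return {key: total / count for key, (count, total) in totals.items()}

def _obs(variation, browser, value):
    """Build one hypothetical observation row."""
    attrs = [{"name": "browser", "value": browser}] if browser else []
    return {"variation_name": variation, "attributes": attrs, "observation": value}

observations_sample = [
    _obs("control", "chrome", 2),
    _obs("control", "chrome", 0),
    _obs("control", "firefox", 1),
    _obs("message_1", "chrome", 0),
    _obs("message_1", "chrome", 1),
    _obs("message_1", None, 5),  # no browser attribute, so it is dropped
]

segment_values_sample = metric_values_by_segment(observations_sample, "browser")
```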

In [None]:
# Compute metric values broken down by customer segment
spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        variation_name,
        attrs.value as browser,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
        LATERAL VIEW explode(attributes) AS attrs
    WHERE
        attrs.name = "browser"
    GROUP BY
        metric_name,
        experiment_name,
        variation_name,
        attrs.value
    ORDER BY
        metric_name,
        experiment_name,
        variation_name,
        attrs.value
""")

## Computing sequential statistics with Optimizely's Stats Services

According to the metric data above, visitors who saw either of our informational banners during our experiment were less likely to call support.  How confident can we be that the difference in call rates can be attributed to our banner, as opposed to statistical noise?

We're working on launching a set of Stats Services that can be used to perform sequential hypothesis testing on metric observation data.  You can learn more about these services and request early access [here](https://optimizely.com/solutions/data-teams).

## Writing our datasets to disk

We'll write our experiment units, experiment events, and metric observations datasets to disk so that they may be used for other analysis tasks.

In [None]:
from lib import util

experiment_units_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=experiment_units")
util.write_parquet_data_to_disk(experiment_units, experiment_units_dir, partition_by="experiment_id")

experiment_events_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=experiment_events")
util.write_parquet_data_to_disk(experiment_events, experiment_events_dir, partition_by=["experiment_id", "event_name"])

metric_observations_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=metric_observations")
util.write_parquet_data_to_disk(observations, metric_observations_dir, partition_by=["experiment_id", "metric_name"])

## How to run this notebook

This notebook lives in the [Optimizely Labs](http://github.com/optimizely/labs) repository.  You can download it and everything you need to run it by doing one of the following:
- Downloading a zipped copy of this Lab directory on the [Optimizely Labs page](https://www.optimizely.com/labs/computing-experiment-subjects/)
- Downloading a [zipped copy of the Optimizely Labs repository](https://github.com/optimizely/labs/archive/master.zip) from GitHub
- Cloning the [GitHub repository](http://github.com/optimizely/labs)

Once you've downloaded this Lab directory (on its own, or as part of the [Optimizely Labs](http://github.com/optimizely/labs) repository), follow the instructions in the `README.md` file for this Lab.