## Statsmodel Forecast A/B Testing

A/B testing is one method of comparing models against each other. This demonstration shows how to use the Wallaroo pipeline steps `add_random_split` and `replace_with_random_split` to randomly route inference input data to control and challenger models.

## Prerequisites

* A Wallaroo instance version 2023.2.1 or greater.

## References

* [Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Python Models](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-model-uploads/wallaroo-sdk-model-upload-python/)
* [Wallaroo SDK Essentials Guide: Pipeline Management](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipelines/wallaroo-sdk-essentials-pipeline/)
* [Wallaroo SDK Essentials: Inference Guide: Parallel Inferences](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-inferences/#parallel-inferences)

## A/B Testing

### Import Libraries

The first step is to import the libraries that we will need.

In [58]:
import json
import os
import datetime

import wallaroo
from wallaroo.object import EntityNotFoundError
from wallaroo.framework import Framework

# used to display dataframe information without truncating
from IPython.display import display
import pandas as pd
import numpy as np

from resources import simdb
from resources import util

pd.set_option('display.max_colwidth', None)

In [59]:
display(wallaroo.__version__)

'2023.2.1'

### Initialize connection

Start a connection to the Wallaroo instance and save the connection into the variable `wl`.

In [60]:
# Login through local Wallaroo instance

wl = wallaroo.Client()

wallarooPrefix = "doc-test."
wallarooSuffix = "wallaroocommunity.ninja"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")
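
The endpoint URLs above are assembled from a prefix and a suffix. As a minimal sketch, the following shows what the two f-strings evaluate to with the values used in this tutorial. `build_endpoints` is a hypothetical helper for illustration and is not part of the Wallaroo SDK.

```python
# Hypothetical helper (not part of the Wallaroo SDK): builds the two endpoint
# URLs with the same f-string patterns used in the connection cell.
def build_endpoints(prefix: str, suffix: str) -> dict:
    return {
        "api_endpoint": f"https://{prefix}api.{suffix}",
        "auth_endpoint": f"https://{prefix}keycloak.{suffix}",
    }

endpoints = build_endpoints("doc-test.", "wallaroocommunity.ninja")
print(endpoints["api_endpoint"])   # https://doc-test.api.wallaroocommunity.ninja
print(endpoints["auth_endpoint"])  # https://doc-test.keycloak.wallaroocommunity.ninja
```

Note that the prefix carries its own trailing period, so an empty prefix yields `https://api.<suffix>` without a stray dot.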

### Set Configurations

The following sets the workspace and pipeline names used for this example.  If the workspace or pipeline already exists, it will be assigned for use in this example.  If it does not exist, it will be created based on the names listed below.

Workspace names must be unique.  To allow this tutorial to run in the same Wallaroo instance for multiple users, the `suffix` variable is meant to be a random set of 4 ASCII characters.  To use the same workspace across the tutorial notebooks, hard code `suffix`, as is done below, and verify that the resulting workspace name is unique across the Wallaroo instance.
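
One way to generate such a suffix is sketched below. `make_suffix` is a hypothetical helper, and restricting the characters to lowercase letters is an assumption made here for illustration, not a documented Wallaroo requirement.

```python
import random
import string

def make_suffix(length: int = 4) -> str:
    """Return '-' followed by `length` random lowercase ASCII letters."""
    # Assumption: lowercase letters only; the tutorial just says "ASCII characters".
    letters = random.choices(string.ascii_lowercase, k=length)
    return "-" + "".join(letters)

suffix = make_suffix()
workspace_name = f"forecast-model-workshop{suffix}"
print(workspace_name)  # the suffix differs on each run
```

Because the suffix changes on every run, a notebook that must reuse an existing workspace should keep the hard coded value instead.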

In [61]:
# used for unique workspace names

suffix='-jch2'

workspace_name = f'forecast-model-workshop{suffix}'

pipeline_name = 'forecast-workshop-pipeline'

### Set the Workspace and Pipeline

The workspace and the pipeline are retrieved if they already exist, and created if they do not.

In [62]:
def get_workspace(name):
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
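
The get-or-create pattern used by `get_workspace` can be tried without a Wallaroo instance. The sketch below uses `FakeClient`, a hypothetical in-memory stand-in that only mimics the two calls involved. It is not the Wallaroo client, and its workspaces are plain dictionaries.

```python
class FakeClient:
    """Hypothetical in-memory stand-in for a client; not the Wallaroo SDK."""

    def __init__(self):
        self._workspaces = {}

    def list_workspaces(self):
        return list(self._workspaces.values())

    def create_workspace(self, name):
        workspace = {"name": name}
        self._workspaces[name] = workspace
        return workspace


def get_or_create_workspace(client, name):
    # Reuse the workspace when one with this name already exists.
    for ws in client.list_workspaces():
        if ws["name"] == name:
            return ws
    # Otherwise create it.
    return client.create_workspace(name)


client = FakeClient()
first = get_or_create_workspace(client, "forecast-model-workshop-jch2")
second = get_or_create_workspace(client, "forecast-model-workshop-jch2")
print(first is second)                # True
print(len(client.list_workspaces()))  # 1
```

Calling the function twice with the same name returns the same object, which is why the tutorial cell can be rerun safely.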

### Upload Model

The Python models created in "Forecast and Parallel Infer with Statsmodel: Model Creation" will now be uploaded: one control model and two challenger models.  Note that the framework is set to `Framework.PYTHON` and the runtime to `python`.

In [63]:
# upload three models:  the control and two challengers

control_model_name = 'forecast-control-model'
control_model_file = './forecast_standard.py'

challenger01_model_name = 'forecast-challenger01-model'
challenger01_model_file = './forecast_alternate01.py'

challenger02_model_name = 'forecast-challenger02-model'
challenger02_model_file = './forecast_alternate02.py'

# upload the models

control_model = wl.upload_model(control_model_name, control_model_file, Framework.PYTHON).configure(runtime="python")

challenger_model_01 = wl.upload_model(challenger01_model_name, challenger01_model_file, Framework.PYTHON).configure(runtime="python")

challenger_model_02 = wl.upload_model(challenger02_model_name, challenger02_model_file, Framework.PYTHON).configure(runtime="python")


### Deploy the Pipeline

We will now add the uploaded control model as a step for the pipeline, then deploy it.  Note that `pipeline.deploy()` is called below without an explicit deployment configuration; a configuration that allows multiple pipeline replicas, each using 0.25 CPU and 512 Mi of RAM, would need to be passed to the deployment.

In [64]:
# Add the control model as the pipeline step, then deploy the pipeline

pipeline.add_model_step(control_model)
pipeline.deploy()

| | |
|---|---|
| name | forecast-workshop-pipeline |
| created | 2023-07-26 19:38:56.059951+00:00 |
| last_updated | 2023-07-26 20:46:14.631514+00:00 |
| deployed | True |
| tags | |
| versions | 4b529632-8f30-4ae6-a588-8bca3971f20f, 659821ab-d0ad-40c2-984f-5052757c0782, 8dcf0175-07a3-4df3-8289-7b950a1de32b |
| steps | forecast-challenger02-model |


### Run Inference

For this example, we will forecast bike rentals by looking back one month from "today", which is set to 2011-02-22.  The data from 2011-01-23 to 2011-01-27 (the 5 days starting from one month back) are used to generate a forecast of bike rentals over the next week from "today", which is 2011-02-23 to 2011-03-01.

In [65]:
with open("./data/testdata_dict.json") as f:
    inferencedata = json.load(f)

results = pipeline.infer(inferencedata)

display(results)

[{'forecast': [1764, 1749, 1743, 1741, 1740, 1740, 1740]}]
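
The date windows described for this inference can be reproduced with the standard library. This is a minimal sketch under two assumptions: "one month back" is taken as 30 days before "today", and the seven forecast values correspond, in order, to the seven forecast days.

```python
from datetime import date, timedelta

today = date(2011, 2, 22)

# Assumption: "one month back" means 30 days before "today".
lookback_start = today - timedelta(days=30)
lookback_days = [lookback_start + timedelta(days=i) for i in range(5)]

# The forecast covers the 7 days starting the day after "today".
forecast_start = today + timedelta(days=1)
forecast_days = [forecast_start + timedelta(days=i) for i in range(7)]

# Forecast values copied from the inference output above.
forecast = [1764, 1749, 1743, 1741, 1740, 1740, 1740]

print(lookback_days[0], "to", lookback_days[-1])  # 2011-01-23 to 2011-01-27
print(forecast_days[0], "to", forecast_days[-1])  # 2011-02-23 to 2011-03-01

# Assumption: values are ordered by day, earliest first.
for day, rentals in zip(forecast_days, forecast):
    print(day.isoformat(), rentals)
```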

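### Replace Pipeline Step with Random Split

The single pipeline step is now replaced with a weighted random split across the three models.  The weights used below are 1:1:1, so the control and the two challenger models should each receive roughly one third of the inference requests.  A 2:1:1 split, where the control gets 50% of the inference requests and the other two models 25% each, would instead use a weight of 2 for the control model.

In [66]:
pipeline.replace_with_random_split(0, 
                                   [(1, control_model), 
                                    (1, challenger_model_01), 
                                    (1, challenger_model_02)], 
                                    "session_id"
                                    )
pipeline.deploy()

| | |
|---|---|
| name | forecast-workshop-pipeline |
| created | 2023-07-26 19:38:56.059951+00:00 |
| last_updated | 2023-07-26 20:46:28.006211+00:00 |
| deployed | True |
| tags | |
| versions | 5bf788b1-fa77-4fdb-a837-409af9c19e2d, 4b529632-8f30-4ae6-a588-8bca3971f20f, 659821ab-d0ad-40c2-984f-5052757c0782, 8dcf0175-07a3-4df3-8289-7b950a1de32b |
| steps | forecast-challenger02-model |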
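
How split weights translate into traffic shares can be sketched without a Wallaroo instance. The code below is an illustration only: `split_fractions` and `route` are hypothetical helpers, and the hash-based routing assumes that the `"session_id"` argument acts as a hash key, so that requests with the same session id reach the same model. It does not reproduce Wallaroo's internal routing.

```python
import hashlib
from collections import Counter

def split_fractions(weighted):
    """Convert (weight, name) pairs into each name's expected traffic share."""
    total = sum(weight for weight, _ in weighted)
    return {name: weight / total for weight, name in weighted}

def route(session_id, weighted):
    """Deterministically pick a name for a session id (illustrative only)."""
    total = sum(weight for weight, _ in weighted)
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Map the hash onto [0, total) and walk the cumulative weights.
    point = int.from_bytes(digest[:8], "big") / 2**64 * total
    cumulative = 0.0
    for weight, name in weighted:
        cumulative += weight
        if point < cumulative:
            return name
    return weighted[-1][1]  # guard against floating point edge cases

equal = [(1, "control"), (1, "challenger01"), (1, "challenger02")]
skewed = [(2, "control"), (1, "challenger01"), (1, "challenger02")]

print(split_fractions(equal))   # one third each
print(split_fractions(skewed))  # {'control': 0.5, 'challenger01': 0.25, 'challenger02': 0.25}

# Route 10,000 made-up session ids through the 2:1:1 split.
counts = Counter(route(f"session-{i}", skewed) for i in range(10_000))
print(counts)
```

With many distinct session ids the observed counts approach the expected shares, while a single repeated id always lands on the same model.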

In [67]:
import time

# load the same sample input used for the earlier inference
with open("./data/testdata_dict.json") as f:
    inferencedata = json.load(f)

results = pipeline.infer(inferencedata)

# timestamps bracketing the second inference request
start_time = datetime.datetime.now()

time.sleep(5)

results = pipeline.infer(inferencedata)

end_time = datetime.datetime.now()

display(results)

[{'forecast': [1764, 1749, 1743, 1741, 1740, 1740, 1740]}]

To see which model handled each inference request, retrieve the pipeline logs and include the `metadata` dataset.

In [68]:
logs = pipeline.logs(dataset=["time", "out.json","metadata"])
display(logs)

| | time | out.json | metadata.last_model | metadata.pipeline_version | metadata.elapsed | metadata.dropped |
|---|---|---|---|---|---|---|
| 0 | 2023-07-26 20:46:40.832 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | {"model_name":"forecast-control-model","model_sha":"525ea2be4402725878382631c2c32b2e3f105bf78eedf41f3ac6d71c0dfa986b"} | 5bf788b1-fa77-4fdb-a837-409af9c19e2d | [23900, 22865945] | [] |
| 1 | 2023-07-26 20:46:35.399 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | {"model_name":"forecast-control-model","model_sha":"525ea2be4402725878382631c2c32b2e3f105bf78eedf41f3ac6d71c0dfa986b"} | 5bf788b1-fa77-4fdb-a837-409af9c19e2d | [13200, 26798253] | [] |
| 2 | 2023-07-26 20:46:27.009 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | {"model_name":"forecast-control-model","model_sha":"525ea2be4402725878382631c2c32b2e3f105bf78eedf41f3ac6d71c0dfa986b"} | 4b529632-8f30-4ae6-a588-8bca3971f20f | [57200, 28982572] | [] |
| 3 | 2023-07-26 19:39:41.466 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | {"model_name":"forecast-control-model","model_sha":"525ea2be4402725878382631c2c32b2e3f105bf78eedf41f3ac6d71c0dfa986b"} | 659821ab-d0ad-40c2-984f-5052757c0782 | [13400, 29222224] | [] |
| 4 | 2023-07-26 19:39:35.730 | {"forecast":[1814,1814,1814,1814,1814,1814,1814]} | {"model_name":"forecast-challenger02-model","model_sha":"c740dbb02a650178065a7dd3d82b88b51d95dcc3fb90a02082389465f4a1a35e"} | 659821ab-d0ad-40c2-984f-5052757c0782 | [14500, 31617266] | [] |
| 5 | 2023-07-26 19:39:08.625 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | {"model_name":"forecast-control-model","model_sha":"525ea2be4402725878382631c2c32b2e3f105bf78eedf41f3ac6d71c0dfa986b"} | | [67600, 27882155] | [] |


The following retrieves the `model_name` field from the JSON string stored in the `metadata.last_model` column.

In [69]:
def get_log_model(df: pd.DataFrame):
    return df['metadata.last_model'].apply(lambda x: json.loads(x)['model_name'])
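
To see what the lambda does for a single row, the sketch below parses one `metadata.last_model` value copied from the logs above using only the standard library. The doubled quotes in the log listing are display escaping; the stored value is ordinary JSON.

```python
import json

# One `metadata.last_model` value copied from the logs above.
last_model = (
    '{"model_name":"forecast-control-model",'
    '"model_sha":"525ea2be4402725878382631c2c32b2e3f105bf78eedf41f3ac6d71c0dfa986b"}'
)

# json.loads turns the string into a dict, from which the name is read by key.
parsed = json.loads(last_model)
print(parsed["model_name"])  # forecast-control-model
```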

In [70]:
logs['model'] = get_log_model(logs)

logs.loc[:, ["time", "out.json", "model"]]

| | time | out.json | model |
|---|---|---|---|
| 0 | 2023-07-26 20:46:40.832 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | forecast-control-model |
| 1 | 2023-07-26 20:46:35.399 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | forecast-control-model |
| 2 | 2023-07-26 20:46:27.009 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | forecast-control-model |
| 3 | 2023-07-26 19:39:41.466 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | forecast-control-model |
| 4 | 2023-07-26 19:39:35.730 | {"forecast":[1814,1814,1814,1814,1814,1814,1814]} | forecast-challenger02-model |
| 5 | 2023-07-26 19:39:08.625 | {"forecast":[1764,1749,1743,1741,1740,1740,1740]} | forecast-control-model |
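
A quick tally of which model served each logged request can be sketched as follows. The rows are copied from the log output above as `(pipeline_version, model_name)` pairs. Treating `5bf788b1-fa77-4fdb-a837-409af9c19e2d` as the version that contains the random split is an inference from the deployment output, where it is the newest version listed, not something the logs state directly.

```python
from collections import Counter

# Assumption: this is the pipeline version created by the random split deploy.
SPLIT_VERSION = "5bf788b1-fa77-4fdb-a837-409af9c19e2d"

# (pipeline_version, model_name) pairs copied from the log rows above.
rows = [
    ("5bf788b1-fa77-4fdb-a837-409af9c19e2d", "forecast-control-model"),
    ("5bf788b1-fa77-4fdb-a837-409af9c19e2d", "forecast-control-model"),
    ("4b529632-8f30-4ae6-a588-8bca3971f20f", "forecast-control-model"),
    ("659821ab-d0ad-40c2-984f-5052757c0782", "forecast-control-model"),
    ("659821ab-d0ad-40c2-984f-5052757c0782", "forecast-challenger02-model"),
    (None, "forecast-control-model"),
]

# Count across every logged request, then only those from the split version.
all_counts = Counter(model for _, model in rows)
split_counts = Counter(model for version, model in rows if version == SPLIT_VERSION)

print(all_counts)    # Counter({'forecast-control-model': 5, 'forecast-challenger02-model': 1})
print(split_counts)  # Counter({'forecast-control-model': 2})
```

Only two of the logged requests fall under that version, which is far too few to judge whether the split matches its weights; many more inferences would be needed.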


### Undeploy the Pipeline

Undeploy the pipeline to return its resources to the Wallaroo instance.

In [71]:
pipeline.undeploy()

| | |
|---|---|
| name | forecast-workshop-pipeline |
| created | 2023-07-26 19:38:56.059951+00:00 |
| last_updated | 2023-07-26 20:46:28.006211+00:00 |
| deployed | False |
| tags | |
| versions | 5bf788b1-fa77-4fdb-a837-409af9c19e2d, 4b529632-8f30-4ae6-a588-8bca3971f20f, 659821ab-d0ad-40c2-984f-5052757c0782, 8dcf0175-07a3-4df3-8289-7b950a1de32b |
| steps | forecast-challenger02-model |
