# Guided Exercise: Drift

### Setup:
You are the principal data scientist at a new startup that recommends prices for rental home listings. Your beach-head market was San Francisco, and that is where you trained the model that is the core service of the business. Now the startup is looking to expand into Seattle and Austin. Using the mean price difference between San Francisco and each new city as a baseline, you want to make sure your price recommendations don't drift. If they drift too low, your customers will leave money on the table; if they drift too high, their listings will sit vacant. Hitting the Goldilocks zone is critical for acquiring and keeping happy customers in Seattle and Austin.

Competitors in Seattle are within 65 dollars of the ideal price and, due to stiffer competition, competitors in Austin are within 40 dollars of the ideal price. These are the benchmarks we need to hit to prove a viable product.

#### Goals 🎯

In this tutorial, you will learn how to:
1. Set up and view the results of stability tests.
2. Debug the true cause of stability issues.
3. Retest the new model and confirm the effectiveness of the mitigation strategy.

### First, set the credentials for your TruEra deployment.

If you don't have credentials yet, get them by signing up for the free private beta: https://go.truera.com/diagnostics-free

In [None]:
#connection details
CONNECTION_STRING = ""
AUTH_TOKEN = ""

### Install required packages

In [None]:
! pip install --upgrade shap
! pip install --upgrade truera

### From here, run the rest of the notebook and follow the analysis.

### Next, load data and train the model in your beach-head market, San Francisco. Also add data for Seattle and Austin, your target markets.

In [None]:
import pandas as pd
import xgboost as xgb
from sklearn import preprocessing
import sklearn.metrics
from sklearn.utils import resample
import logging

from truera.client.truera_workspace import TrueraWorkspace
from truera.client.truera_authentication import TokenAuthentication

auth = TokenAuthentication(AUTH_TOKEN)
tru = TrueraWorkspace(CONNECTION_STRING, auth, ignore_version_mismatch=True, log_level=logging.ERROR)

# set our environment to local compute so we can compute predictions and feature influences on our local machine
tru.set_environment("local")
# note: we'll periodically toggle between local and remote so we can interact with our remote deployment as well.

In [None]:
# load data
san_francisco = pd.read_csv('https://truera-examples.s3.us-west-2.amazonaws.com/data/starter-stability/San_Francisco_for_stability.csv')
seattle = pd.read_csv('https://truera-examples.s3.us-west-2.amazonaws.com/data/starter-stability/Seattle_for_stability.csv')
austin = pd.read_csv('https://truera-examples.s3.us-west-2.amazonaws.com/data/starter-stability/Austin_for_stability.csv')

# train first model
xgb_reg = xgb.XGBRegressor(eta = 0.2, max_depth = 4)
xgb_reg.fit(san_francisco.drop('price', axis = 1), san_francisco.price)

# create the first project and data collection
tru.add_project("Starter Example - Drift", score_type = 'regression')
tru.add_data_collection("Data Collection v1")

# add data splits to the collection we just created
tru.add_data_split("San Francisco", pre_data = san_francisco.drop('price', axis = 1), label_data = san_francisco['price'], split_type = "train")
tru.add_data_split("Seattle", pre_data = seattle.drop('price', axis = 1), label_data = seattle['price'], split_type = "test")
tru.add_data_split("Austin", pre_data = austin.drop('price', axis = 1), label_data = austin['price'], split_type = "test")

# register the model
tru.add_python_model("model_1", xgb_reg, train_split_name="San Francisco", train_parameters = {"model_type":"xgb.XGBRegressor", "eta":0.2, "max_depth":4})

# sync with remote
tru.upload_project()





### Get the average ground truth price in each city to use for defining our stability test thresholds.

In [None]:
tru.set_data_split("San Francisco")
San_Francisco_mean_price = tru.get_ys().mean()
tru.set_data_split("Seattle")
Seattle_mean_price = tru.get_ys().mean()
tru.set_data_split("Austin")
Austin_mean_price = tru.get_ys().mean()

print("San Francisco mean listing price: " + str(San_Francisco_mean_price))
print("Seattle mean listing price: " + str(Seattle_mean_price))
print("Austin mean listing price: " + str(Austin_mean_price))

#calculate expected difference in price recommendations from beach-head to target market
Seattle_expected_difference = Seattle_mean_price - San_Francisco_mean_price
Austin_expected_difference = Austin_mean_price - San_Francisco_mean_price

print("Expected price difference from San Francisco to Seattle: " + str(Seattle_expected_difference))
print("Expected price difference from San Francisco to Austin: " + str(Austin_expected_difference))


San Francisco mean listing price: 205.2558100370495
Seattle mean listing price: 127.80739963264234
Austin mean listing price: 227.01126421697288
Expected price difference from San Francisco to Seattle: -77.44841040440717
Expected price difference from San Francisco to Austin: 21.755454179923362
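
To make the benchmarks from the setup concrete, here is a minimal sketch that turns the printed means into the acceptable range for each city. The helper name `tolerance_band` is hypothetical (not part of the TruEra SDK), and the means are copied by hand from the output above rather than read from the workspace.

```python
# Mean listing prices copied from the output above.
san_francisco_mean = 205.2558100370495
seattle_mean = 127.80739963264234
austin_mean = 227.01126421697288


def tolerance_band(base_mean, target_mean, tolerance):
    """Return (low, high) bounds around the expected mean price difference.

    Hypothetical helper for illustration only.
    """
    expected_difference = target_mean - base_mean
    return expected_difference - tolerance, expected_difference + tolerance


# Tolerances come from the setup: 65 dollars for Seattle, 40 dollars for Austin.
seattle_band = tolerance_band(san_francisco_mean, seattle_mean, 65)
austin_band = tolerance_band(san_francisco_mean, austin_mean, 40)

print(f"Seattle acceptable range: {seattle_band[0]:.2f} to {seattle_band[1]:.2f}")
print(f"Austin acceptable range: {austin_band[0]:.2f} to {austin_band[1]:.2f}")
```

These are the same bounds that the stability tests take as thresholds: the expected difference plus or minus the competitor tolerance.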


### Test for stability in Seattle and Austin.

In [None]:
# toggle back to remote so we can interact with the tester
tru.set_environment("remote")
tru.set_project("Starter Example - Drift")
tru.set_data_collection("Data Collection v1")

# Create stability tests in accordance with the setup
tru.tester.add_stability_test(test_name = "Stability Test - Seattle",
    base_data_split_name = "San Francisco",
    comparison_data_split_name_regex = "Seattle",
    fail_if_outside = [Seattle_expected_difference - 65, Seattle_expected_difference + 65])

tru.tester.add_stability_test(test_name = "Stability Test - Austin",
    base_data_split_name = "San Francisco",
    comparison_data_split_names = ["Austin"],
    fail_if_outside = [Austin_expected_difference - 40, Austin_expected_difference + 40])

tru.set_model("model_1")
tru.tester.get_model_test_results(test_types=["stability"])

| | Name | Comparison Split | Base Split | Segment | Metric | Score | Navigate |
|---|---|---|---|---|---|---|---|
| ❌ | Stability Test - Seattle | Seattle | San Francisco | ALL POINTS | DIFFERENCE_OF_MEAN | -2.1431 | Explore in UI |
| ❌ | Stability Test - Austin | Austin | San Francisco | ALL POINTS | DIFFERENCE_OF_MEAN | 62.1055 | Explore in UI |


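The model fails in both Seattle and Austin because its DIFFERENCE_OF_MEAN score falls outside the allowed range around the expected ground-truth difference. For Seattle, the allowed range is roughly -142.45 to -12.45, but the score is -2.14. For Austin, the allowed range is roughly -18.24 to 61.76, but the score is 62.11. In both cities the scores have drifted too high relative to the ground truth.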
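
As a rough cross-check of the test results, the sketch below measures how far each score lands outside its band. It uses plain Python with values copied by hand from the outputs above. The function `band_violation` is a hypothetical helper, and treating the band edges as inclusive is an assumption; the tester's exact boundary behavior is not stated here.

```python
def band_violation(score, expected_difference, tolerance):
    """Signed distance by which score falls outside the band (0.0 if inside).

    Hypothetical helper; band edges are assumed to be inclusive.
    """
    low = expected_difference - tolerance
    high = expected_difference + tolerance
    if score < low:
        return score - low   # negative: score is below the band
    if score > high:
        return score - high  # positive: score is above the band
    return 0.0


# (score, expected difference, tolerance) copied from the outputs above.
tests = {
    "Seattle": (-2.1431, -77.44841040440717, 65),
    "Austin": (62.1055, 21.755454179923362, 40),
}

violations = {city: band_violation(*args) for city, args in tests.items()}
for city, violation in violations.items():
    status = "pass" if violation == 0.0 else "fail"
    print(f"{city}: {status}, outside the band by {violation:.2f}")
```

Under these assumptions, Seattle overshoots the upper bound by roughly 10 dollars, while Austin misses by well under a dollar.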

### From here, navigate to the TruEra Web App for analysis or continue on to Part 2! [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1SIshdf_nE2dCWPdGNfUJ3UUuWgbocANn)