# Guided Exercise: Performance
#### Goals 🎯
In this tutorial, you will use TruEra to train and ingest a model, then improve its performance in a structured and methodical way!

You will:
1. Set up and view the results of performance and feature importance tests.
2. Find actionable issues with the model.
3. Mitigate these issues and re-upload your model to TruEra.
4. Retest the new model and confirm the effectiveness of the mitigation strategy.

### First, set the credentials for your TruEra deployment.

If you don't have credentials yet, get them by signing up for the free private beta: https://go.truera.com/diagnostics-free

In [None]:
# connection details
CONNECTION_STRING = ""
AUTH_TOKEN = ""

### Install required packages

In [None]:
! pip install --upgrade shap
! pip install --upgrade truera

### From here, run the rest of the notebook and follow the analysis.

In [None]:
import pandas as pd
import xgboost as xgb
import logging

from truera.client.truera_workspace import TrueraWorkspace
from truera.client.truera_authentication import TokenAuthentication

auth = TokenAuthentication(AUTH_TOKEN)
tru = TrueraWorkspace(CONNECTION_STRING, auth, ignore_version_mismatch = True, log_level=logging.ERROR)

# set our environment to local compute so we can compute predictions and feature influences on our local machine
tru.set_environment("local")
# note: we'll periodically toggle between local and remote so we can interact with our remote deployment as well.

### Load the data and train an xgboost model
A bit about the model and data... 

In this example, we will use real data on Airbnb listings 🏠 in San Francisco and Seattle to predict the listing price. The data was scraped by Inside Airbnb and is hosted by OpenDataSoft. Pricing a rental property is a challenging task for Airbnb hosts: they need to understand the market, the features of their property, and how those features contribute to the listing price.

You can find more information about the data here:
https://data.opendatasoft.com/explore/dataset/airbnb-listings%40public/

In [None]:
# load data
san_francisco = pd.read_csv('https://truera-examples.s3.us-west-2.amazonaws.com/data/starter-performance/San_Francisco.csv')
seattle = pd.read_csv('https://truera-examples.s3.us-west-2.amazonaws.com/data/starter-performance/Seattle.csv')

# train first model
xgb_reg = xgb.XGBRegressor(eta = 0.2, max_depth = 4)
xgb_reg.fit(san_francisco.drop('price', axis = 1), san_francisco.price)

# create the first project and data collection
tru.add_project("Starter Example - Performance", score_type = 'regression')
tru.add_data_collection("Data Collection v1")

# add data splits to the collection we just created
tru.add_data_split("San Francisco", pre_data = san_francisco.drop('price', axis = 1), label_data = san_francisco['price'], split_type = "train")
tru.add_data_split("Seattle", pre_data = seattle.drop('price', axis = 1), label_data = seattle['price'], split_type = "test")

# register the model
tru.add_python_model("model_1", xgb_reg, train_split_name="San Francisco", train_parameters = {"model_type":"xgb.XGBRegressor", "eta":0.2, "max_depth":4})

# sync with remote
tru.upload_project()





### Issue: Overfitting

We observe a large discrepancy between the model's error on the train split and on the test split!

In [None]:
# toggle to remote to interact with the tester
tru.set_environment("remote")
tru.set_project("Starter Example - Performance")
tru.set_data_collection("Data Collection v1")
tru.set_model("model_1")

train_split_name = "San Francisco"
test_split_name = "Seattle"

# generate the explainer and compute performance
explainer = tru.get_explainer(test_split_name, comparison_data_splits=[train_split_name])
explainer.compute_performance(metric_type="MAE")

| | Split | MAE |
|---|---|---|
| 0 | Seattle | 123.479141 |
| 1 | San Francisco | 51.163929 |
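
For reference, MAE is the mean of the absolute differences between predicted and actual prices. The sketch below uses made-up prices rather than the Airbnb data to show the calculation in plain NumPy, and then computes the ratio between the two MAE values reported above. The helper name `mean_absolute_error` is illustrative only and is not part of the TruEra SDK.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Mean of |actual - predicted| over all rows."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy prices, invented for illustration only.
toy_actual = [100.0, 150.0, 200.0]
toy_predicted = [110.0, 140.0, 230.0]
toy_mae = mean_absolute_error(toy_actual, toy_predicted)  # (10 + 10 + 30) / 3

# MAE values copied from the output above.
mae_test = 123.479141   # Seattle
mae_train = 51.163929   # San Francisco
gap_ratio = mae_test / mae_train

print(f"toy MAE: {toy_mae:.2f}")
print(f"test MAE is {gap_ratio:.2f}x the train MAE")
```

A test error more than twice the train error suggests the model has fit San Francisco-specific patterns that do not carry over to Seattle.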


To help us keep track of this issue, let's add a test for it!

Additionally, having too many unimportant features is a common cause of overfitting. We should test for that as well.

Note that we could also set up tests for fairness and stability.
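
One way to make "too many unimportant features" concrete is to count how many features fall below an importance cutoff. The sketch below is an assumption about how such a check could look, not TruEra's implementation: the function name, the 0.01 cutoff, and the toy importance values are all hypothetical. With a fitted XGBoost model, `xgb_reg.feature_importances_` could supply the importances instead of the toy list.

```python
import numpy as np

def count_unimportant_features(importances, min_importance=0.01):
    """Return how many features have a normalized importance below the cutoff."""
    importances = np.asarray(importances, dtype=float)
    total = importances.sum()
    if total == 0:
        # No feature carries any importance, so all of them count as unimportant.
        return int(importances.size)
    normalized = importances / total
    return int(np.sum(normalized < min_importance))

# Toy importances, invented for illustration only.
toy_importances = [0.50, 0.30, 0.15, 0.04, 0.005, 0.005]
n_unimportant = count_unimportant_features(toy_importances, min_importance=0.01)
print(f"{n_unimportant} of {len(toy_importances)} features fall below the cutoff")
```

A check like this could then fail when the count exceeds a limit you choose for your model.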

In [None]:
# add performance tests
tru.tester.add_performance_test(
    test_name='Relative MAE Test',
    all_data_collections=True,
    data_split_name_regex='Seattle',
    metric="MAE",
    reference_split_name=train_split_name,
    fail_if_greater_than=0.75, # will fail if the MAE on data is > (1 + 0.75) * MAE of train_split_name
    fail_threshold_type="RELATIVE"
)

tru.tester.add_performance_test(
    test_name='RMSE Test',
    all_data_collections=True,
    data_split_name_regex='.*',
    metric="RMSE",
    fail_if_greater_than=110, # will fail if the RMSE on data is > 110
    fail_threshold_type="ABSOLUTE"
)
# get model results
tru.set_model("model_1")
tru.tester.get_model_test_results(test_types=["performance"])

| | Name | Split | Segment | Metric | Score | Navigate |
|---|---|---|---|---|---|---|
| ❌ | Relative MAE Test | Seattle | ALL POINTS | MAE | 123.4791 | Explore in UI |
| ❌ | RMSE Test | Seattle | ALL POINTS | RMSE | 161.2632 | Explore in UI |
| ✅ | RMSE Test | San Francisco | ALL POINTS | RMSE | 82.4993 | Explore in UI |
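
These results can be checked by hand from the numbers above. The sketch below mirrors the threshold logic described in the test comments: a relative test fails if the score exceeds (1 + threshold) times the reference score, and an absolute test fails if the score exceeds the threshold. The two helper functions are hypothetical and only illustrate the arithmetic; they are not TruEra SDK calls.

```python
def fails_relative(score, reference_score, threshold):
    """Fail when score > (1 + threshold) * reference_score."""
    return score > (1 + threshold) * reference_score

def fails_absolute(score, threshold):
    """Fail when score > threshold."""
    return score > threshold

# Scores copied from the outputs above.
mae_seattle = 123.4791
mae_san_francisco = 51.163929
rmse_seattle = 161.2632
rmse_san_francisco = 82.4993

relative_limit = (1 + 0.75) * mae_san_francisco  # about 89.54
print(f"Relative MAE limit for Seattle: {relative_limit:.2f}")
print("Relative MAE Test fails:", fails_relative(mae_seattle, mae_san_francisco, 0.75))
print("RMSE Test fails on Seattle:", fails_absolute(rmse_seattle, 110))
print("RMSE Test fails on San Francisco:", fails_absolute(rmse_san_francisco, 110))
```

The Seattle MAE of about 123.48 is well above the limit of about 89.54, and the Seattle RMSE is above 110, so both failures match the table.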


### Both tests are failing on the Seattle split.

### From here, navigate to the TruEra Web App for analysis or continue on to Part 2! [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/16DexGCY1i4A5fLJZXC7xHPpqCSrQhVab)