## Model Validation - Top Feature Stress Testing
- Connects to a project and a selected model
- Selects the top n features by feature impact for the selected model
- The user must input extreme values, outside the range of the training data, for each of the top features
- Extreme values must satisfy model risk assessment requirements and be based on domain knowledge
- The stress test extends the partial dependence test to extreme values
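
As a rough illustration of the last point, here is a minimal sketch of the idea on a toy problem. It is not the notebook's implementation: `toy_predict` is a hypothetical stand-in for a trained model, and the three-row sample is made up. Each stress value overwrites one feature in every row of the sample and the mean prediction is recorded, as in partial dependence, but with values outside the observed range.

```python
import pandas as pd

def toy_predict(data):
    """Hypothetical stand-in for a trained model: a response that saturates in x1."""
    return 2.0 * data["x1"].clip(0, 10) + data["x2"]

def stress_partial_dependence(sample, predict, feature, values):
    """Mean prediction when `feature` is set to each value in every row."""
    results = []
    for value in values:
        data = sample.copy()   # reset to the original sample
        data[feature] = value  # overwrite one feature everywhere
        results.append((feature, value, float(predict(data).mean())))
    return pd.DataFrame(results, columns=["feature", "value", "prediction"])

# made-up sample; x1 is observed between 1 and 9
sample = pd.DataFrame({"x1": [1.0, 4.0, 9.0], "x2": [0.1, 0.2, 0.3]})
out = stress_partial_dependence(sample, toy_predict, "x1", [-100, 5, 1000])
print(out)
```

A model that behaves sensibly under stress should show bounded mean predictions at the extreme values, as the saturating toy model does here.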

In [4]:
import pandas as pd
import random
import numpy as np
import datarobot as dr
import time
import warnings
warnings.filterwarnings('ignore')
import re
import stress_test

In [5]:
# initialize datarobot client instance
dr.Client(config_path='/Users/benjamin.miller/.config/datarobot/my_drconfig.yaml')
# User inputs for project and selected model
PROJECT_ID = '5ac3dc2ad668761ef4ddb463'
MODEL_ID = '5ac3dd8c12019100d61d22f0'

In [6]:
# read in training data
df = pd.read_csv('/Users/benjamin.miller/src/demo_data/10K_NBA_2017-2018.csv')

In [7]:
# draw a sample of 1000 rows for calculating partial dependence
sample = df.sample(n=1000)

In [8]:
# number of top features to stress test
n = 6
# get the top n features by feature impact for the model
top_features = stress_test.get_top_n_features(PROJECT_ID, MODEL_ID, n)

Model type: eXtreme Gradient Boosted Trees Regressor with Early Stopping
Requesting feture impact for model 5ac3dd8c12019100d61d22f0
Feature impact compute done.
- Time: 'get_top_n_features' 3.99
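
`stress_test` is a local helper module whose source is not shown here. As a rough sketch of what `get_top_n_features` might do after feature impact has been computed (an assumption, not the actual implementation), the function below ranks feature impact records by normalized impact and keeps the first n names. The record layout is assumed to follow the list of dicts that the DataRobot Python client returns for feature impact (`featureName`, `impactNormalized`); the function name and the impact values are made up for illustration.

```python
def top_n_from_impact(impact_records, n):
    """Return the names of the n features with the highest normalized impact."""
    ranked = sorted(
        impact_records,
        key=lambda record: record["impactNormalized"],
        reverse=True,
    )
    return [record["featureName"] for record in ranked[:n]]

# made-up impact values, for illustration only
example_impact = [
    {"featureName": "roto_fpts", "impactNormalized": 0.62},
    {"featureName": "text_yesterday_and_today", "impactNormalized": 1.0},
    {"featureName": "spread_decay1_mean", "impactNormalized": 0.08},
]
top_two = top_n_from_impact(example_impact, 2)
print(top_two)
```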


In [10]:
print(top_features)

['text_yesterday_and_today', 'roto_fpts', 'minutes_played_decay1_mean', 'total_rebound_percent_decay1_std', 'game_score_lag30_mean', 'spread_decay1_mean']


In [11]:
# User must input extreme values to stress the model
# Values must be outside the range of the training data and in the same format as the training data
d = {
    'text_yesterday_and_today': ['jfkldfjkds;a jfkdl;j;af nonsense',
                                 'injury to lower left orbital pluto neptune',
                                 'on a planned flight to the moon',
                                 'michael jordan is the best',
                                 '84883 939393 9393959 2224-999::'],
    'roto_fpts': [-10, -1, 100, 200, 1000],
    'minutes_played_decay1_mean': [-10, -1, 50, 75, 100],
    'total_rebound_percent_decay1_std': [-10, -1, 25, 75, 100],
    'game_score_lag30_mean': [-100, -10, 100, 150, 200],
    'spread_decay1_mean': [-50, -25, 25, 50, 75]
}
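
The requirement that stress values fall outside the training range can be checked before any predictions are requested. The helper below is a hypothetical sketch (`values_inside_training_range` is not part of the notebook or of any library): for each numeric feature it reports the stress values that lie within the observed minimum and maximum. The small training frame and the stress values are invented for the example; in the notebook the same call could be made with `df` and `d`.

```python
import pandas as pd

def values_inside_training_range(train, stress_values):
    """Map each numeric feature to the stress values that are NOT extreme."""
    inside = {}
    for feature, values in stress_values.items():
        column = train[feature]
        if not pd.api.types.is_numeric_dtype(column):
            continue  # text features have no numeric range to compare against
        low, high = column.min(), column.max()
        hits = [v for v in values if low <= v <= high]
        if hits:
            inside[feature] = hits
    return inside

# invented training data, for illustration only
train = pd.DataFrame({
    "roto_fpts": [0.0, 12.5, 61.0],
    "text_yesterday_and_today": ["rest day", "probable", "out"],
})
stress = {
    "roto_fpts": [-10, -1, 50, 200, 1000],
    "text_yesterday_and_today": ["on a planned flight to the moon"],
}
not_extreme = values_inside_training_range(train, stress)
print(not_extreme)
```

An empty result means every numeric stress value lies outside the training range; anything reported should be replaced with a more extreme value.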

In [14]:
# mean prediction for each stress value of the top n features
k_v_avg_pred = []
project = dr.Project(id=PROJECT_ID)
# get the selected model once, outside the loops
model = dr.Model.get(project=PROJECT_ID, model_id=MODEL_ID)
# loop through all features and values for stress testing
for k in d:
    for v in d[k]:
        # reset dataset and overwrite feature k with stress value v in every row
        data = sample.copy()
        data[k] = v
        # upload as stress_data
        stress_data = project.upload_dataset(data)
        # start a predict job
        predict_job = model.request_predictions(stress_data.id)
        # block until the job finishes instead of polling and relying on an error;
        # raises if the job takes longer than max_wait seconds
        predictions = predict_job.get_result_when_complete(max_wait=600)
        print(k, v, 'predictions finished')
        k_v_avg_pred.append((k, v, predictions.prediction.mean()))

text_yesterday_and_today jfkldfjkds;a jfkdl;j;af nonsense predictions finished
text_yesterday_and_today injury to lower left orbital pluto neptune predictions finished
text_yesterday_and_today on a planned flight to the moon predictions finished
text_yesterday_and_today michael jordan is the best predictions finished
text_yesterday_and_today 84883 939393 9393959 2224-999:: predictions finished
roto_fpts -10 predictions finished
roto_fpts -1 predictions finished
roto_fpts 100 predictions finished
roto_fpts 200 predictions finished
roto_fpts 1000 predictions finished
minutes_played_decay1_mean -10 predictions finished
minutes_played_decay1_mean -1 predictions finished
minutes_played_decay1_mean 50 predictions finished
minutes_played_decay1_mean 75 predictions finished
minutes_played_decay1_mean 100 predictions finished
total_rebound_percent_decay1_std -10 predictions finished
total_rebound_percent_decay1_std -1 predictions finished
total_rebound_percent_decay1_std 25 predictions finished

In [17]:
# format output as a dataframe
df_out = pd.DataFrame(k_v_avg_pred, columns=['feature', 'value', 'prediction'])

In [18]:
df_out

| | feature | value | prediction |
|---|---|---|---|
| 0 | text_yesterday_and_today | jfkldfjkds;a jfkdl;j;af nonsense | 10.951278 |
| 1 | text_yesterday_and_today | injury to lower left orbital pluto neptune | 9.990803 |
| 2 | text_yesterday_and_today | on a planned flight to the moon | 11.539843 |
| 3 | text_yesterday_and_today | michael jordan is the best | 9.619647 |
| 4 | text_yesterday_and_today | 84883 939393 9393959 2224-999:: | 10.951278 |
| 5 | roto_fpts | -10 | 10.921985 |
| 6 | roto_fpts | -1 | 10.921985 |
| 7 | roto_fpts | 100 | 12.874027 |
| 8 | roto_fpts | 200 | 12.874027 |
| 9 | roto_fpts | 1000 | 12.874027 |
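
One way to read these results is to compare, per feature, how far the mean prediction moves across the stress values. The sketch below rebuilds the ten rows shown above by hand and computes the prediction range per feature; in the notebook the same `groupby` could be applied directly to `df_out`. The column name `spread` is chosen here for illustration.

```python
import pandas as pd

# the ten rows displayed in the output above
rows = [
    ("text_yesterday_and_today", "jfkldfjkds;a jfkdl;j;af nonsense", 10.951278),
    ("text_yesterday_and_today", "injury to lower left orbital pluto neptune", 9.990803),
    ("text_yesterday_and_today", "on a planned flight to the moon", 11.539843),
    ("text_yesterday_and_today", "michael jordan is the best", 9.619647),
    ("text_yesterday_and_today", "84883 939393 9393959 2224-999::", 10.951278),
    ("roto_fpts", -10, 10.921985),
    ("roto_fpts", -1, 10.921985),
    ("roto_fpts", 100, 12.874027),
    ("roto_fpts", 200, 12.874027),
    ("roto_fpts", 1000, 12.874027),
]
shown = pd.DataFrame(rows, columns=["feature", "value", "prediction"])

# smallest and largest mean prediction per feature, and the gap between them
summary = shown.groupby("feature")["prediction"].agg(["min", "max"])
summary["spread"] = summary["max"] - summary["min"]
print(summary)
```

For `roto_fpts`, the mean prediction is identical at -10 and -1 and identical again at 100, 200, and 1000, which suggests the model's response flattens out beyond the training range rather than extrapolating without bound.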


In [20]:
# write output as csv
df_out.to_csv('stress_test_example_output.csv')