# Notebook contents & summary

Running this notebook will give you the equivalent of the snapshots and scans in the DemoAccount.

### SET-UP
 - import etiq
 - log in to the dashboard
 - open a project


### SNAPSHOT 1: xgboost, pre-configured model
 - SCANS:
  - accuracy 
  - bias metrics


### SNAPSHOT 2: xgboost, pre-configured model
 - SCANS:
  - bias sources 
  - accuracy 
  - bias metrics
  - leakage
    
### MODEL FIXES 

### SNAPSHOT 3: xgboost, new version, pre-configured model
 - SCANS:
  - bias sources 
  - accuracy 
  - bias metrics 
  - leakage


### SNAPSHOT 4: xgboost, new version, pre-configured model, added dataset for previous time period
 - SCANS:
  - bias sources 
  - accuracy 
  - bias metrics 
  - leakage
  - feature drift 
	
### SNAPSHOT 5: user defined model  
 - SCANS: 
  - accuracy 
  - bias metrics 
  - leakage
 

### SNAPSHOT 6: production 
 - SCANS:
  - feature drift 
  - applicable bias metrics 



# SET-UP

In [1]:
import etiq


Thanks for trying out the ETIQ.ai toolkit!

Visit our getting started documentation at https://docs.etiq.ai/

Visit our Slack channel at https://etiqcore.slack.com/ for support or feedback.



In [2]:
from etiq import login as etiq_login
etiq_login("https://dashboard.etiq.ai/", "<token>")


(Dashboard supplied updated license information)


Connection successful. Projects and pipelines will be displayed in the dashboard. 😀

In [3]:
# Can enumerate all available projects
all_projects = etiq.projects.get_all_projects()
print(all_projects)

[]


In [4]:
# Can get/create a single named project
project = etiq.projects.open(name="Demo Project")

# SNAPSHOT 1: xgboost, pre-configured model


To illustrate some of the library's features, we build a model that predicts whether an applicant makes over or under 50K using the Adult dataset from https://archive.ics.uci.edu/ml/datasets/adult.


First, we'll be encoding the categorical features found in this dataset.

Second, we'll log the dataset to Etiq.

In this case we encode before splitting into train/validation/test because the categories in this dataset are known in advance, so in production we won't run into a new category that is missing from it.

However if this is not the case for your use case, you should NOT encode prior to splitting your sample, as this might lead to LEAKAGE.

Label-encoding categorical values is itself problematic, because it imposes a numerical ordering on unordered categories. Best practice is to use one-hot encoding. Because the free library is limited to 15 features, we do not one-hot encode in this example.

Remember: this is an example only. For the majority of scans in Etiq, you log the model to Etiq once you have the sample that you'll be training on. Usually this sample will contain numeric features only, as otherwise you could not use it with the training methods of most supported libraries.
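
To make the leakage caveat above concrete, here is a minimal sketch (standard library only, not Etiq's `LabelEncoder`) of fitting a category-to-integer mapping on the training rows alone and then applying it to held-out rows. The `UNSEEN` code and the helper names are hypothetical, chosen only for illustration.

```python
# Sketch: fit the encoding on the training split only, so that nothing about
# the held-out rows influences the mapping.
UNSEEN = -1  # hypothetical code for categories absent from the training split


def fit_label_mapping(values):
    """Map each distinct category to an integer, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


def apply_label_mapping(values, mapping):
    """Encode values; anything not seen during fitting becomes UNSEEN."""
    return [mapping.get(value, UNSEEN) for value in values]


train_workclass = ["Private", "Local-gov", "Private", "?"]
test_workclass = ["Private", "Self-emp-inc"]  # second value never seen in train

mapping = fit_label_mapping(train_workclass)
encoded_train = apply_label_mapping(train_workclass, mapping)
encoded_test = apply_label_mapping(test_workclass, mapping)

print(mapping)
print(encoded_train, encoded_test)
```

If every category is known up front, as assumed for the Adult dataset here, fitting on the full data and fitting on the training split give the same set of categories, which is why encoding before the split is acceptable in this notebook.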

In [5]:
# Loading a dataset. We're using the adult dataset
data = etiq.utils.load_sample("adultdata")
data.head()


Unnamed: 0,age,workclass,fnlwgt,education,educational-num,marital-status,occupation,relationship,race,gender,capital-gain,capital-loss,hours-per-week,native-country,income
0,25,Private,226802,11th,7,Never-married,Machine-op-inspct,Own-child,Black,Male,0,0,40,United-States,<=50K
1,38,Private,89814,HS-grad,9,Married-civ-spouse,Farming-fishing,Husband,White,Male,0,0,50,United-States,<=50K
2,28,Local-gov,336951,Assoc-acdm,12,Married-civ-spouse,Protective-serv,Husband,White,Male,0,0,40,United-States,>50K
3,44,Private,160323,Some-college,10,Married-civ-spouse,Machine-op-inspct,Husband,Black,Male,7688,0,40,United-States,>50K
4,18,?,103497,Some-college,10,Never-married,?,Own-child,White,Female,0,0,30,United-States,<=50K


In [6]:
from etiq.transforms import LabelEncoder
import pandas as pd
import numpy as np 

# use a LabelEncoder to transform categorical variables
cont_vars = ['age', 'educational-num', 'fnlwgt', 'capital-gain', 'capital-loss', 'hours-per-week']
cat_vars = list(set(data.columns.values) - set(cont_vars))

label_encoders = {}
data_encoded = pd.DataFrame()
for i in cat_vars:
    label = LabelEncoder()
    data_encoded[i] = label.fit_transform(data[i])
    label_encoders[i] = label

data_encoded.set_index(data.index, inplace=True)
data_encoded = pd.concat([data.loc[:, cont_vars], data_encoded], axis=1).copy()


## Loading the config file

In [7]:
# XXX: Make per-project.
etiq.load_config("./config_demo_snapshots.json")


{'dataset': {'label': 'income',
  'bias_params': {'protected': 'gender',
   'privileged': 1,
   'unprivileged': 0,
   'positive_outcome_label': 1,
   'negative_outcome_label': 0},
  'train_valid_test_splits': [0.8, 0.1, 0.1],
  'cat_col': 'cat_vars',
  'cont_col': 'cont_vars'},
 'scan_accuracy_metrics': {'thresholds': {'accuracy': [0.8, 1.0],
   'true_pos_rate': [0.7, 1.0],
   'true_neg_rate': [0.6, 1.0]}},
 'scan_bias_metrics': {'thresholds': {'equal_opportunity': [0.0, 0.2],
   'demographic_parity': [0.0, 0.2],
   'equal_odds_tnr': [0.0, 0.2],
   'individual_fairness': [0.0, 0.2],
   'equal_odds_tpr': [0.0, 0.2]}},
 'scan_bias_sources': {'auto': True},
 'scan_leakage': {'leakage_threshold': 0.85},
 'scan_drift_metrics': {'thresholds': {'psi': [0.0, 0.15],
   'kolmogorov_smirnov': [0.05, 1.0]},
  'drift_measures': ['kolmogorov_smirnov', 'psi']}}
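
If you do not have `config_demo_snapshots.json` to hand, the sketch below writes a file with the same content as the dictionary printed above. It assumes the JSON file uses the same layout as that printed dictionary, which this notebook does not confirm; all values are copied from the output, and the file is written to a temporary directory so nothing existing is overwritten.

```python
import json
import tempfile
from pathlib import Path

# Assumption: the config file mirrors the dictionary returned by
# etiq.load_config. Every value below is copied from that printed output.
config = {
    "dataset": {
        "label": "income",
        "bias_params": {
            "protected": "gender",
            "privileged": 1,
            "unprivileged": 0,
            "positive_outcome_label": 1,
            "negative_outcome_label": 0,
        },
        "train_valid_test_splits": [0.8, 0.1, 0.1],
        "cat_col": "cat_vars",
        "cont_col": "cont_vars",
    },
    "scan_accuracy_metrics": {
        "thresholds": {
            "accuracy": [0.8, 1.0],
            "true_pos_rate": [0.7, 1.0],
            "true_neg_rate": [0.6, 1.0],
        }
    },
    "scan_bias_metrics": {
        "thresholds": {
            "equal_opportunity": [0.0, 0.2],
            "demographic_parity": [0.0, 0.2],
            "equal_odds_tnr": [0.0, 0.2],
            "individual_fairness": [0.0, 0.2],
            "equal_odds_tpr": [0.0, 0.2],
        }
    },
    "scan_bias_sources": {"auto": True},
    "scan_leakage": {"leakage_threshold": 0.85},
    "scan_drift_metrics": {
        "thresholds": {"psi": [0.0, 0.15], "kolmogorov_smirnov": [0.05, 1.0]},
        "drift_measures": ["kolmogorov_smirnov", "psi"],
    },
}

with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "config_demo_snapshots.json"
    config_path.write_text(json.dumps(config, indent=2))
    # Read the file back to confirm it round-trips unchanged.
    reloaded = json.loads(config_path.read_text())

print(reloaded["scan_leakage"])
```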

## Logging the snapshot to Etiq 

This can happen at any point in the pipeline and in a variety of ways.

In [8]:
from etiq.model import DefaultXGBoostClassifier

# Load your dataset
dataset_loader = etiq.dataset(data_encoded)

# Load our model
model = DefaultXGBoostClassifier()

# Create a snapshot
snapshot = project.snapshots.create(name="Snapshot 1", dataset=dataset_loader.initial_dataset, model=model, bias_params=dataset_loader.bias_params)


## Start scanning for errors

## Accuracy Metrics

In [9]:
(segments, issues, issue_summary) = snapshot.scan_accuracy_metrics()

INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0557:Starting pipeline
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0557:Computed acurracy metrics for the dataset
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0557:Completed pipeline


In [10]:
issue_summary

Unnamed: 0,name,metric,measure,features,segments,total_issues_tested,issues_found,threshold
0,accuracy_below_threshold,<function accuracy at 0x7f7e4b1e8160>,,{},{},1,0,"[0.8, 1.0]"
1,accuracy_above_threshold,<function accuracy at 0x7f7e4b1e8160>,,{},{},1,0,"[0.8, 1.0]"
2,true_pos_rate_below_threshold,<function true_pos_rate at 0x7f7e4b1e81f0>,,{},{all},1,1,"[0.7, 1.0]"
3,true_pos_rate_above_threshold,<function true_pos_rate at 0x7f7e4b1e81f0>,,{},{},1,0,"[0.7, 1.0]"
4,true_neg_rate_below_threshold,<function true_neg_rate at 0x7f7e4b1e8280>,,{},{},1,0,"[0.6, 1.0]"
5,true_neg_rate_above_threshold,<function true_neg_rate at 0x7f7e4b1e8280>,,{},{},1,0,"[0.6, 1.0]"
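
The summary above reports each metric twice, once as `*_below_threshold` and once as `*_above_threshold`, against the `[lower, upper]` range from the config. The following is a rough sketch of that kind of range check, not Etiq's implementation; the helper names and the toy labels and predictions are made up for illustration.

```python
def confusion_rates(labels, preds):
    """Return accuracy, true positive rate and true negative rate."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "true_pos_rate": tp / positives,
        "true_neg_rate": tn / negatives,
    }


def check_range(metric_name, value, threshold):
    """List the issue names raised when value falls outside [lower, upper]."""
    lower, upper = threshold
    issues = []
    if value < lower:
        issues.append(f"{metric_name}_below_threshold")
    if value > upper:
        issues.append(f"{metric_name}_above_threshold")
    return issues


# Thresholds copied from the config; labels and predictions are toy data.
thresholds = {
    "accuracy": (0.8, 1.0),
    "true_pos_rate": (0.7, 1.0),
    "true_neg_rate": (0.6, 1.0),
}
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

rates = confusion_rates(labels, preds)
found = {name: check_range(name, rates[name], thresholds[name]) for name in thresholds}
print(found)
```

With this toy data only the true positive rate (2 of 3 positives recovered) falls below its lower bound, which matches the pattern in the table: good overall accuracy can hide a weak positive class.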


## Bias Metrics 

In [11]:
#scan_bias_metrics

(segments, issues, issue_summary) = snapshot.scan_bias_metrics()

INFO:etiq.pipeline.BiasMetricsIssuePipeline0680:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0680:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0680:Completed pipeline


In [12]:
issues
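
For readers unfamiliar with the metric names in the config, the sketch below computes two of them, demographic parity and equal opportunity, as absolute differences between the privileged and unprivileged groups. These are common textbook definitions; Etiq's exact formulas may differ, and the rows are toy data.

```python
def positive_rate(rows):
    """Share of rows that received the positive prediction."""
    return sum(pred for _, _, pred in rows) / len(rows)


def true_positive_rate(rows):
    """Share of truly positive rows that received the positive prediction."""
    positives = [pred for _, label, pred in rows if label == 1]
    return sum(positives) / len(positives)


# Each row is (gender, label, prediction); privileged = 1, unprivileged = 0,
# following the bias_params in the config. The values are illustrative only.
rows = [
    (1, 1, 1), (1, 1, 1), (1, 0, 1), (1, 0, 0),
    (0, 1, 1), (0, 1, 0), (0, 0, 0), (0, 0, 0),
]
privileged = [row for row in rows if row[0] == 1]
unprivileged = [row for row in rows if row[0] == 0]

demographic_parity = abs(positive_rate(privileged) - positive_rate(unprivileged))
equal_opportunity = abs(true_positive_rate(privileged) - true_positive_rate(unprivileged))

print(demographic_parity, equal_opportunity)
```

Both differences come out at 0.5 for this toy data, well outside the `[0.0, 0.2]` range configured for the bias metrics scan.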

## Leakage Scan

In [13]:
(segments, issues, issue_summary) = snapshot.scan_leakage()

INFO:etiq.pipeline.DataPipeline0530:Starting pipeline
INFO:etiq.pipeline.DataPipeline0530:Computed metrics for the initial dataset
INFO:etiq.pipeline.DataPipeline0530:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0681:Starting pipeline
INFO:etiq.pipeline.DebiasPipeline0681:Start Phase IdentifyFeatureLeakPipeline0673
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0673:Using parent model
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0673:Starting pipeline
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0673:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0681:Completed Phase IdentifyFeatureLeakPipeline0673
INFO:etiq.pipeline.DebiasPipeline0681:Completed pipeline


In [14]:
issue_summary

Unnamed: 0,name,metric,measure,features,segments,total_issues_tested,issues_found,threshold
0,target_leakage_issue,,<function corrcoef at 0x7f7eb807cb80>,{},{},13,0,"(0, 0.85)"
1,demographic_leakage_issue,,<function corrcoef at 0x7f7eb807cb80>,{},{},13,0,"(0, 0.85)"
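
The summary lists `corrcoef` as the measure and `(0, 0.85)` as the threshold, which suggests a correlation-based check. The sketch below shows one way such a check could work: flag any feature whose absolute Pearson correlation with the target exceeds the threshold. It is an illustration under that assumption, not Etiq's implementation; `tax_bracket` and `constant` are hypothetical features added to show a leaking and a degenerate case.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # avoid dividing by a zero standard deviation
    return cov / math.sqrt(var_x * var_y)


def leaking_features(features, reference, threshold=0.85):
    """Names of features whose |correlation| with reference exceeds threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, reference)) > threshold
    ]


target = [0, 0, 1, 1, 0, 1]
features = {
    "age": [25, 38, 28, 44, 18, 30],
    "tax_bracket": [0, 0, 1, 1, 0, 1],  # hypothetical: a copy of the target
    "constant": [1, 1, 1, 1, 1, 1],     # hypothetical: zero variance
}

print(leaking_features(features, target))
```

Passing the protected attribute instead of the target as `reference` gives the analogous check for demographic leakage.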


# SNAPSHOT 2: xgboost, pre-configured  

In [15]:
# Load your dataset
# For now, the bias sources scan needs the dataset built explicitly with DatasetLoader

dataset_loader = etiq.dataset(data_encoded)
dl = etiq.dataset_loader.DatasetLoader(data=data_encoded, label='income', bias_params=dataset_loader.bias_params,
                   train_valid_test_splits=[0.8, 0.1, 0.1], cat_col=cat_vars,
                   cont_col=cont_vars, names_col = data_encoded.columns.values)

from etiq.model import DefaultXGBoostClassifier
# Load our model
model = DefaultXGBoostClassifier()

# Creating a snapshot
snapshot = project.snapshots.create(name="Snapshot 2", dataset=dl.initial_dataset, model=model, bias_params=dataset_loader.bias_params)

In [16]:
#Bias metrics scans, accuracy metrics scans, data leakage scans


(segments, issues, issue_summary) = snapshot.scan_accuracy_metrics()

(segments, issues, issue_summary) = snapshot.scan_bias_metrics()


(segments, issues, issue_summary) = snapshot.scan_leakage()



INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0975:Starting pipeline
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0975:Computed acurracy metrics for the dataset
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0975:Completed pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0593:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0593:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0593:Completed pipeline
INFO:etiq.pipeline.DataPipeline0366:Starting pipeline
INFO:etiq.pipeline.DataPipeline0366:Computed metrics for the initial dataset
INFO:etiq.pipeline.DataPipeline0366:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0790:Starting pipeline
INFO:etiq.pipeline.DebiasPipeline0790:Start Phase IdentifyFeatureLeakPipeline0704
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0704:Using parent model
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0704:Starting pipeline
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0704:Completed pipeline
INFO:etiq.pip

In [17]:
(segments, issues, issue_summary) = snapshot.scan_bias_sources()

INFO:etiq.pipeline.DataPipeline0693:Starting pipeline
INFO:etiq.pipeline.DataPipeline0693:Computed metrics for the initial dataset
INFO:etiq.pipeline.DataPipeline0693:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0187:Starting pipeline
INFO:etiq.pipeline.DebiasPipeline0187:Start Phase IdentifyPipeline0701
INFO:etiq.pipeline.IdentifyPipeline0701:Using parent model
INFO:etiq.pipeline.IdentifyPipeline0701:Starting pipeline
INFO:etiq.pipeline.IdentifyPipeline0701:Checking proxy for feature age
INFO:etiq.pipeline.IdentifyPipeline0701:Checking correlation for feature age
INFO:etiq.pipeline.IdentifyPipeline0701:Checking proxy for feature educational-num
INFO:etiq.pipeline.IdentifyPipeline0701:Checking correlation for feature educational-num
INFO:etiq.pipeline.IdentifyPipeline0701:Checking proxy for feature fnlwgt
INFO:etiq.pipeline.IdentifyPipeline0701:Checking correlation for feature fnlwgt
INFO:etiq.pipeline.IdentifyPipeline0701:Checking proxy for feature capital-gain
INFO:etiq.pipeli


INFO:etiq.pipeline.IdentifyPipeline0701:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0187:Completed Phase IdentifyPipeline0701
INFO:etiq.pipeline.DebiasPipeline0187:Computed metrics for the initial dataset
INFO:etiq.pipeline.DebiasPipeline0187:Completed pipeline


In [18]:
issues

Unnamed: 0,name,feature,segment,measure,measure_value,metric,metric_value,threshold
0,low_unpriv_sample,,0.0,,,,,"(0.0, 0.8)"
1,low_unpriv_sample,,1.0,,,,,"(0.0, 0.8)"
2,low_unpriv_sample,,2.0,,,,,"(0.0, 0.8)"
3,low_unpriv_sample,,3.0,,,,,"(0.0, 0.8)"
4,low_unpriv_sample,,4.0,,,,,"(0.0, 0.8)"
...,...,...,...,...,...,...,...,...
197,correlation_issue,relationship,20.0,<function corrcoef at 0x7f7eb807cb80>,,,,"(0.0, 0.2)"
198,correlation_issue,occupation,20.0,<function corrcoef at 0x7f7eb807cb80>,,,,"(0.0, 0.2)"
199,limited_features_issue,,3.0,,,<function equal_opportunity at 0x7f7e4b1e84c0>,0.325000,"(0.0, 0.2)"
200,limited_features_issue,,4.0,,,<function equal_opportunity at 0x7f7e4b1e84c0>,0.234568,"(0.0, 0.2)"


In [19]:
issue_summary

Unnamed: 0,name,metric,measure,features,segments,total_issues_tested,issues_found,threshold
0,missing_sample,,,{},"{5, 6, 7, 19, 20}",21,5,"(0.0, 0.0)"
1,low_unpriv_sample,,,{},"{0, 1, 2, 3, 4, 8, 9, 10, 11, 12, 13, 14, 15, ...",16,16,"(0.0, 0.8)"
2,low_priv_sample,,,{},{},16,0,"(0.0, 0.8)"
3,skewed_priv_sample,,,{},{},14,0,"(0.0, 0.2)"
4,skewed_unpriv_sample,,,{},"{10, 4}",16,2,"(0.0, 0.2)"
5,proxy_issue,,<function corrcoef at 0x7f7eb807cb80>,{relationship},"{3, 4, 8, 10, 12, 16, 17, 18}",273,8,"(0.0, 0.5)"
6,correlation_issue,,<function corrcoef at 0x7f7eb807cb80>,"{marital-status, educational-num, age, native-...","{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...",273,168,"(0.0, 0.2)"
7,low_volume_group,,,{},{},21,0,"(1000, inf)"
8,limited_features_issue,<function equal_opportunity at 0x7f7e4b1e84c0>,,{},"{10.0, 3.0, 4.0}",21,3,"(0.0, 0.2)"


In [20]:
pd.set_option('display.max_colwidth', None)

In [21]:
segments

Unnamed: 0,name,business_rule,mask
0,0,`native-country` == 39 and `occupation` == 6,"[False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
1,1,`native-country` == 39 and `occupation` == 1 and `education` == 11,"[False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
2,2,`native-country` == 39 and `occupation` == 14 and `workclass` == 4,"[False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, False, False, False, False, True, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
3,3,`native-country` == 39 and `occupation` == 7 and `workclass` == 4 and `education` == 11,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
4,4,`native-country` == 39 and `occupation` == 13,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, True, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, True, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
5,5,`occupation` == 4 and `race` == 4 and `native-country` == 39 and `workclass` == 4 and `relationship` == 0,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
6,6,`occupation` == 3 and `workclass` == 4 and `marital-status` == 2 and `native-country` == 39 and `relationship` == 0,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, True, False, False, False, False, ...]"
7,7,`occupation` == 10 and `native-country` == 39 and `marital-status` == 2 and `relationship` == 0,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, ...]"
8,8,`native-country` == 39 and `education` == 11 and `race` == 2,"[False, False, True, False, True, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
9,9,`native-country` == 39 and `education` == 11 and `race` == 4 and `occupation` == 3,"[False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, True, False, False, False, False, ...]"
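
Each segment pairs a `business_rule` with a boolean `mask` over the rows. The sketch below shows how a rule made of equality conditions can be turned into such a mask in plain Python. The rule strings above resemble pandas `DataFrame.query` expressions, but how Etiq evaluates them is not shown in this notebook, so treat this as an illustration only; the rows are toy data.

```python
def rule_mask(rows, rule):
    """True for each row that satisfies every column == value condition."""
    return [all(row[column] == value for column, value in rule.items()) for row in rows]


# Toy encoded rows; the rule mirrors segment 0 in the table above.
rows = [
    {"native-country": 39, "occupation": 6},
    {"native-country": 39, "occupation": 1},
    {"native-country": 12, "occupation": 6},
    {"native-country": 39, "occupation": 6},
]
segment_rule = {"native-country": 39, "occupation": 6}

mask = rule_mask(rows, segment_rule)
segment_rows = [row for row, keep in zip(rows, mask) if keep]

print(mask)
```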


# SNAPSHOT 3: xgboost, new version, pre-configured

For this version we remove the `relationship` feature, which the bias sources scan flagged as a proxy in several segments (8 in the issue summary above).

In [22]:
#initial dataset
data.head()

Unnamed: 0,age,workclass,fnlwgt,education,educational-num,marital-status,occupation,relationship,race,gender,capital-gain,capital-loss,hours-per-week,native-country,income
0,25,Private,226802,11th,7,Never-married,Machine-op-inspct,Own-child,Black,Male,0,0,40,United-States,<=50K
1,38,Private,89814,HS-grad,9,Married-civ-spouse,Farming-fishing,Husband,White,Male,0,0,50,United-States,<=50K
2,28,Local-gov,336951,Assoc-acdm,12,Married-civ-spouse,Protective-serv,Husband,White,Male,0,0,40,United-States,>50K
3,44,Private,160323,Some-college,10,Married-civ-spouse,Machine-op-inspct,Husband,Black,Male,7688,0,40,United-States,>50K
4,18,?,103497,Some-college,10,Never-married,?,Own-child,White,Female,0,0,30,United-States,<=50K


In [23]:
#remove relationship

data_clean = data.drop('relationship', axis = 1)

data_clean.head()

Unnamed: 0,age,workclass,fnlwgt,education,educational-num,marital-status,occupation,race,gender,capital-gain,capital-loss,hours-per-week,native-country,income
0,25,Private,226802,11th,7,Never-married,Machine-op-inspct,Black,Male,0,0,40,United-States,<=50K
1,38,Private,89814,HS-grad,9,Married-civ-spouse,Farming-fishing,White,Male,0,0,50,United-States,<=50K
2,28,Local-gov,336951,Assoc-acdm,12,Married-civ-spouse,Protective-serv,White,Male,0,0,40,United-States,>50K
3,44,Private,160323,Some-college,10,Married-civ-spouse,Machine-op-inspct,Black,Male,7688,0,40,United-States,>50K
4,18,?,103497,Some-college,10,Never-married,?,White,Female,0,0,30,United-States,<=50K


In [24]:
from etiq.transforms import LabelEncoder
import pandas as pd
import numpy as np 

# use a LabelEncoder to transform categorical variables
cont_vars = ['age', 'educational-num', 'fnlwgt', 'capital-gain', 'capital-loss', 'hours-per-week']
cat_vars = list(set(data_clean.columns.values) - set(cont_vars))

label_encoders = {}
data_encoded = pd.DataFrame()
for i in cat_vars:
    label = LabelEncoder()
    data_encoded[i] = label.fit_transform(data_clean[i])
    label_encoders[i] = label

data_encoded.set_index(data_clean.index, inplace=True)
data_encoded = pd.concat([data_clean.loc[:, cont_vars], data_encoded], axis=1).copy()


In [25]:
# Load your dataset
# For now, the bias sources scan needs the dataset built explicitly with DatasetLoader

dataset_loader = etiq.dataset(data_encoded)
dl = etiq.dataset_loader.DatasetLoader(data=data_encoded, label='income', bias_params=dataset_loader.bias_params,
                   train_valid_test_splits=[0.8, 0.1, 0.1], cat_col=cat_vars,
                   cont_col=cont_vars, names_col = data_encoded.columns.values)

from etiq.model import DefaultXGBoostClassifier
# Load our model
model = DefaultXGBoostClassifier()

# Creating a snapshot
snapshot = project.snapshots.create(name="Snapshot 3", dataset=dl.initial_dataset, model=model, bias_params=dataset_loader.bias_params)

In [26]:
#Bias metrics scans, accuracy metrics scans, data leakage scans


snapshot.scan_accuracy_metrics()

snapshot.scan_bias_metrics()

snapshot.scan_leakage()

snapshot.scan_bias_sources()

INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0687:Starting pipeline
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0687:Computed acurracy metrics for the dataset
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0687:Completed pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0121:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0121:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0121:Completed pipeline
INFO:etiq.pipeline.DataPipeline0869:Starting pipeline
INFO:etiq.pipeline.DataPipeline0869:Computed metrics for the initial dataset
INFO:etiq.pipeline.DataPipeline0869:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0069:Starting pipeline
INFO:etiq.pipeline.DebiasPipeline0069:Start Phase IdentifyFeatureLeakPipeline0528
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0528:Using parent model
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0528:Starting pipeline
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0528:Completed pipeline
INFO:etiq.pip


INFO:etiq.pipeline.IdentifyPipeline0780:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0472:Completed Phase IdentifyPipeline0780
INFO:etiq.pipeline.DebiasPipeline0472:Computed metrics for the initial dataset
INFO:etiq.pipeline.DebiasPipeline0472:Completed pipeline


(    name  \
 0      0   
 1      1   
 2      2   
 3      3   
 4      4   
 5      5   
 6      6   
 7      7   
 8      8   
 9      9   
 10    10   
 11    11   
 12    12   
 13    13   
 14    14   
 15    15   
 16    16   
 17    17   
 18    18   
 
                                                                                                   business_rule  \
 0                                                                  `native-country` == 39 and `occupation` == 6   
 1                                            `native-country` == 39 and `occupation` == 1 and `education` == 11   
 2                                            `native-country` == 39 and `occupation` == 14 and `workclass` == 4   
 3                       `native-country` == 39 and `occupation` == 7 and `workclass` == 4 and `education` == 11   
 4                                                                 `native-country` == 39 and `occupation` == 13   
 5                   `native-country` == 3

# SNAPSHOT 4: xgboost, pre-configured, + dataset from previous period

In [27]:

# We are still in the build stage, so this drift scan checks for differences between time periods.
# Create the "drifted" dataset for the previous period; this step is not part of the logging library.
yesterday_dataset_df = data_encoded.copy()
yesterday_dataset_df["hours-per-week"] = yesterday_dataset_df["hours-per-week"].multiply(1.2)

yesterday_dataset_df.head()

Unnamed: 0,age,educational-num,fnlwgt,capital-gain,capital-loss,hours-per-week,marital-status,gender,native-country,workclass,education,income,race,occupation
0,25,7,226802,0,0,48.0,4,1,39,4,1,0,2,7
1,38,9,89814,0,0,60.0,2,1,39,4,11,0,4,5
2,28,12,336951,0,0,48.0,2,1,39,2,7,1,4,11
3,44,10,160323,7688,0,48.0,2,1,39,4,15,1,2,7
4,18,10,103497,0,0,36.0,4,0,39,0,15,0,4,0


In [28]:
# Create the dataset for the current data
#dataset_s = etiq.SimpleDatasetBuilder.from_dataframe(data_encoded, target_feature='income').build()

dataset_loader = etiq.dataset(data_encoded)


# Create the comparison dataset from the previous period's data
yesterday_dataset_s = etiq.SimpleDatasetBuilder.from_dataframe(yesterday_dataset_df, target_feature='income').build()

from etiq.model import DefaultXGBoostClassifier
# Load our model
model = DefaultXGBoostClassifier()


# Creating a snapshot
snapshot = project.snapshots.create(name="Snapshot 4", dataset=dataset_loader.initial_dataset, comparison_dataset=yesterday_dataset_s, model=model, bias_params=dataset_loader.bias_params)



In [29]:
snapshot.scan_accuracy_metrics()

snapshot.scan_bias_metrics()

INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0217:Starting pipeline
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0217:Computed acurracy metrics for the dataset
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0217:Completed pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0415:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0415:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0415:Completed pipeline


(  name business_rule  \
 0  all           all   
 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             mask  
 0  [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, T

In [30]:
(segments, issues, issue_summary) = snapshot.scan_drift_metrics()

INFO:etiq.pipeline.DriftPipeline0245:Starting pipeline
INFO:etiq.pipeline.DriftPipeline0245:Calculated drift measures.
INFO:etiq.pipeline.DriftPipeline0245:Identifying drift measure issues.
INFO:etiq.pipeline.DriftPipeline0245:Completed pipeline


In [31]:
issues

Unnamed: 0,name,feature,segment,measure,measure_value,metric,metric_value,threshold
0,feature_drift_above_threshold,hours-per-week,all,<function psi at 0x7f7e4a652550>,2.087093,,,"[0.0, 0.15]"
1,feature_drift_below_threshold,hours-per-week,all,<function kolmogorov_smirnov at 0x7f7e4a6525e0>,0.0,,,"[0.05, 1.0]"


In [32]:
issue_summary

Unnamed: 0,name,metric,measure,features,segments,total_issues_tested,issues_found,threshold
0,feature_drift_below_threshold,,<function psi at 0x7f7e4a652550>,{},{},12,0,"[0.0, 0.15]"
1,feature_drift_above_threshold,,<function psi at 0x7f7e4a652550>,{hours-per-week},{all},12,1,"[0.0, 0.15]"
2,feature_drift_below_threshold,,<function kolmogorov_smirnov at 0x7f7e4a6525e0>,{hours-per-week},{all},12,1,"[0.05, 1.0]"
3,feature_drift_above_threshold,,<function kolmogorov_smirnov at 0x7f7e4a6525e0>,{},{},12,0,"[0.05, 1.0]"
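
To give a feel for the `psi` measure, the sketch below computes a population stability index over equal-width bins for a small toy sample and for the same sample multiplied by 1.2, mirroring how the comparison dataset was built. The binning scheme and the `eps` floor for empty bins are assumptions of this sketch; Etiq's implementation may bin differently and produce different values.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index over shared equal-width bins."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for value in values:
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(count / len(values), eps) for count in counts]

    expected_share = proportions(expected)
    actual_share = proportions(actual)
    return sum(
        (a - e) * math.log(a / e) for e, a in zip(expected_share, actual_share)
    )


hours = [40, 50, 40, 40, 30, 45, 60, 35, 40, 20]  # toy hours-per-week values
hours_shifted = [value * 1.2 for value in hours]

psi_same = psi(hours, hours)
psi_shifted = psi(hours, hours_shifted)
print(psi_same, psi_shifted)
```

Identical samples give a PSI of zero, while the scaled sample lands far above the `0.15` upper bound from the config, in line with `hours-per-week` being the only feature flagged by the scan.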


# SNAPSHOT 5: already trained model

In [33]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from xgboost.sklearn import XGBClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import warnings
warnings.filterwarnings('ignore')

# Loading a dataset. We're using the adult dataset
data = etiq.utils.load_sample("adultdata")
data.head()



Unnamed: 0,age,workclass,fnlwgt,education,educational-num,marital-status,occupation,relationship,race,gender,capital-gain,capital-loss,hours-per-week,native-country,income
0,25,Private,226802,11th,7,Never-married,Machine-op-inspct,Own-child,Black,Male,0,0,40,United-States,<=50K
1,38,Private,89814,HS-grad,9,Married-civ-spouse,Farming-fishing,Husband,White,Male,0,0,50,United-States,<=50K
2,28,Local-gov,336951,Assoc-acdm,12,Married-civ-spouse,Protective-serv,Husband,White,Male,0,0,40,United-States,>50K
3,44,Private,160323,Some-college,10,Married-civ-spouse,Machine-op-inspct,Husband,Black,Male,7688,0,40,United-States,>50K
4,18,?,103497,Some-college,10,Never-married,?,Own-child,White,Female,0,0,30,United-States,<=50K


In [34]:
# use a LabelEncoder to transform categorical variables
cont_vars = ['age', 'educational-num', 'fnlwgt', 'capital-gain', 'capital-loss', 'hours-per-week']
cat_vars = list(set(data.columns.values) - set(cont_vars))

label_encoders = {}
data_encoded = pd.DataFrame()
for i in cat_vars:
    label = LabelEncoder()
    data_encoded[i] = label.fit_transform(data[i])
    label_encoders[i] = label

data_encoded.set_index(data.index, inplace=True)
data_encoded = pd.concat([data.loc[:, cont_vars], data_encoded], axis=1).copy()



In [35]:
# prepare the training/testing/validation datasets

# separate into train/validate/test datasets of sizes 80%/10%/10% as percentages of the initial data
data_remaining, test = train_test_split(data_encoded, test_size=0.1)
train, valid = train_test_split(data_remaining, test_size=0.1112)

# because we don't want to train on protected attributes or labels to be predicted, 
# let's remove these columns from the training dataset
protected_train = train['gender'].copy() # gender is a protected attribute
y_train = train['income'].copy() # labels we're going to train the model to predict
x_train = train.drop(columns=['gender','income'])
protected_valid = valid['gender'].copy() 
y_valid = valid['income'].copy() 
x_valid = valid.drop(columns=['gender','income'])
protected_test = test['gender'].copy() 
y_test = test['income'].copy()
x_test = test.drop(columns=['gender','income'])
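
The second split uses `test_size=0.1112` rather than `0.1` because it is applied to the 90% of rows left after the first split. A quick check of the arithmetic, using only the two fractions from the cell above:

```python
# Fractions of the original data that end up in each split.
test_fraction = 0.1                             # first split: 10% held out for test
remaining_fraction = 1.0 - test_fraction        # 90% left for train + validation
valid_fraction = remaining_fraction * 0.1112    # second split, applied to the remainder
train_fraction = remaining_fraction - valid_fraction

print(f"train={train_fraction:.4f} valid={valid_fraction:.4f} test={test_fraction:.4f}")
```

The result is roughly 80%/10%/10%. The actual row counts will differ slightly because `train_test_split` rounds to whole rows.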

In [36]:
# train an XGBoost model to predict 'income'

standard_model = XGBClassifier(use_label_encoder=False, eval_metric='logloss', random_state=4)    
model_fit = standard_model.fit(x_train, y_train)

In [37]:
y_train_pred = standard_model.predict(x_train)
y_valid_pred = standard_model.predict(x_valid)
print('Model accuracy on the training dataset :', 
      round(100 * accuracy_score(y_train, y_train_pred),2),'%') # round the score to 2 digits  

print('Model accuracy on the validation dataset :', 
      round(100 * accuracy_score(y_valid, y_valid_pred),2),'%')

Model accuracy on the training dataset : 90.07 %
Model accuracy on the validation dataset : 87.77 %


## Log config, dataset and model to Etiq

For already trained models, make sure you only use a sample you held out. 

As you don't want any retraining of the model to occur, set train_valid_test_splits to [0.0, 1.0, 0.0] in your config. 

In [38]:
etiq.load_config("./config_already_trained.json")


{'dataset': {'label': 'income',
  'bias_params': {'protected': 'gender',
   'privileged': 1,
   'unprivileged': 0,
   'positive_outcome_label': 1,
   'negative_outcome_label': 0},
  'train_valid_test_splits': [0.0, 1.0, 0.0],
  'cat_col': 'cat_vars',
  'cont_col': 'cont_vars'},
 'scan_accuracy_metrics': {'thresholds': {'accuracy': [0.8, 1.0],
   'true_pos_rate': [0.6, 1.0],
   'true_neg_rate': [0.6, 1.0]}},
 'scan_bias_metrics': {'thresholds': {'equal_opportunity': [0.0, 0.2],
   'demographic_parity': [0.0, 0.2],
   'equal_odds_tnr': [0.0, 0.2],
   'individual_fairness': [0.0, 0.8],
   'equal_odds_tpr': [0.0, 0.2]}},
 'scan_leakage': {'leakage_threshold': 0.85}}
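
A config file with this content could be written from Python as sketched below. This assumes the JSON on disk has the same layout as the dictionary printed above, which the notebook does not show directly. The temporary directory is only a stand-in for wherever you keep your config.

```python
import json
import os
import tempfile

# Same keys and values as the dictionary printed by etiq.load_config above.
config = {
    "dataset": {
        "label": "income",
        "bias_params": {
            "protected": "gender",
            "privileged": 1,
            "unprivileged": 0,
            "positive_outcome_label": 1,
            "negative_outcome_label": 0,
        },
        # All rows go to validation, so nothing is used for (re)training.
        "train_valid_test_splits": [0.0, 1.0, 0.0],
        "cat_col": "cat_vars",
        "cont_col": "cont_vars",
    },
    "scan_accuracy_metrics": {
        "thresholds": {
            "accuracy": [0.8, 1.0],
            "true_pos_rate": [0.6, 1.0],
            "true_neg_rate": [0.6, 1.0],
        }
    },
    "scan_bias_metrics": {
        "thresholds": {
            "equal_opportunity": [0.0, 0.2],
            "demographic_parity": [0.0, 0.2],
            "equal_odds_tnr": [0.0, 0.2],
            "individual_fairness": [0.0, 0.8],
            "equal_odds_tpr": [0.0, 0.2],
        }
    },
    "scan_leakage": {"leakage_threshold": 0.85},
}

# Hypothetical location: a temporary directory instead of the notebook folder.
config_path = os.path.join(tempfile.mkdtemp(), "config_already_trained.json")
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

# Read the file back to confirm it round-trips unchanged.
with open(config_path) as f:
    reloaded = json.load(f)
print(reloaded["dataset"]["train_valid_test_splits"])
```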

In [39]:
from etiq import Model


# Log your dataset
dataset_loader = etiq.dataset(test)

# Log your already trained model
model = Model(model_architecture=standard_model, model_fitted=model_fit)

In [40]:
snapshot = project.snapshots.create(name="Snapshot 5", dataset=dataset_loader.initial_dataset, model=model, bias_params=dataset_loader.bias_params)


In [41]:
snapshot.scan_accuracy_metrics()

snapshot.scan_bias_metrics()


snapshot.scan_leakage()

INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0779:Starting pipeline
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0779:Computed acurracy metrics for the dataset
INFO:etiq.pipeline.AccuracyMetricsIssuePipeline0779:Completed pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0825:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0825:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0825:Completed pipeline
INFO:etiq.pipeline.DataPipeline0327:Starting pipeline
INFO:etiq.pipeline.DataPipeline0327:Computed metrics for the initial dataset
INFO:etiq.pipeline.DataPipeline0327:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0227:Starting pipeline
INFO:etiq.pipeline.DebiasPipeline0227:Start Phase IdentifyFeatureLeakPipeline0943
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0943:Using parent model
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0943:Starting pipeline
INFO:etiq.pipeline.IdentifyFeatureLeakPipeline0943:Completed pipeline
INFO:etiq.pip

(   name     business_rule  \
 0     0  kmeans segment 0   
 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             mask  
 0  [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, Tru

# SNAPSHOT 6: in production

To run scans in production you can use a set-up similar to the pre-production one:
 - for drift, you will need your current feature/label dataset and a comparison dataset (as in the pre-production example above)
 - for bias, you will need your model and the feature/label dataset that you have scored
     
At the moment the functionality does not allow actuals to be recorded separately, but we are working on it. This means that to run scans which use the actuals (accuracy, a good chunk of the bias metrics scans, target and concept drift), you will have to build your dataset so that it includes the actuals. When you log it to Etiq, the actuals column is the 'label' parameter in your config.
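
One way to build such a dataset is to join the scored rows with the actuals once they arrive. The sketch below uses pandas with made-up data: the `application_id` key and both small dataframes are hypothetical, and only the `income` label name comes from this notebook.

```python
import pandas as pd

# Hypothetical scored production rows, keyed by an application id.
scored = pd.DataFrame(
    {
        "application_id": [101, 102, 103],
        "age": [25, 38, 44],
        "hours-per-week": [40, 50, 40],
    }
)

# Hypothetical actuals that arrive later; row 103 has no outcome yet.
actuals = pd.DataFrame({"application_id": [101, 102], "income": [0, 1]})

# An inner join keeps only the rows whose actual outcome is known.
labelled = scored.merge(actuals, on="application_id", how="inner")

# Drop the join key so only the features and the label column remain.
labelled = labelled.drop(columns=["application_id"])
print(labelled)
```

The resulting dataframe has the actuals in the `income` column, which matches the 'label' entry in the configs used in this notebook.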
 
This example is for illustration purposes only, as you will not be running production scans from a Jupyter notebook. 
Etiq can be used with orchestration and model registry tools. Please email us at info@etiq.ai for help with using Etiq with your toolset and for online models. We will be adding demos on how to use Etiq with Airflow and MLflow shortly. 


## Log dataset, comparison dataset, model, config

For the dataset we will use yesterday's and today's datasets set up earlier in the notebook, but any time window will work, depending on your scoring frequency. 

For the model we will use the model trained in the previous step. 

For the config we will use one whose scans are achievable in production without the actuals.
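
To see why some bias scans do not need actuals, consider demographic parity: under one common definition it compares the rate of positive predictions between the privileged and unprivileged groups, so it only needs the model's predictions and the protected attribute. The sketch below uses that definition with toy data. The function names are hypothetical and Etiq's exact formula may differ (for example, a ratio instead of a difference).

```python
from typing import Sequence


def positive_rate(predictions: Sequence[int], positive_label: int = 1) -> float:
    """Share of predictions equal to the positive label."""
    return sum(1 for p in predictions if p == positive_label) / len(predictions)


def demographic_parity_gap(
    predictions: Sequence[int],
    protected: Sequence[int],
    privileged: int = 1,
    unprivileged: int = 0,
    positive_label: int = 1,
) -> float:
    """Absolute difference in positive-prediction rates between the two groups."""
    priv = [p for p, g in zip(predictions, protected) if g == privileged]
    unpriv = [p for p, g in zip(predictions, protected) if g == unprivileged]
    return abs(positive_rate(priv, positive_label) - positive_rate(unpriv, positive_label))


# Toy predictions and protected attribute values (1 = privileged, 0 = unprivileged).
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]

gap = demographic_parity_gap(predictions, protected)
print("demographic parity gap:", gap)
```

No `income` actuals appear anywhere in the calculation. Metrics built on true positive or true negative rates, by contrast, need the actual outcomes.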

In [42]:
etiq.load_config("./config_production.json")


{'dataset': {'label': 'income',
  'bias_params': {'protected': 'gender',
   'privileged': 1,
   'unprivileged': 0,
   'positive_outcome_label': 1,
   'negative_outcome_label': 0},
  'train_valid_test_splits': [0.0, 1.0, 0.0],
  'cat_col': 'cat_vars',
  'cont_col': 'cont_vars'},
 'scan_bias_metrics': {'thresholds': {'equal_opportunity': [0.0, 0.2],
   'demographic_parity': [0.0, 0.2],
   'equal_odds_tnr': [0.0, 0.2],
   'individual_fairness': [0.0, 0.8],
   'equal_odds_tpr': [0.0, 0.2]}},
 'scan_drift_metrics': {'thresholds': {'psi': [0.0, 0.15],
   'kolmogorov_smirnov': [0.05, 1.0]},
  'drift_measures': ['kolmogorov_smirnov', 'psi']}}
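
For intuition about the `psi` threshold, here is a minimal pure-Python sketch of the population stability index using its textbook definition, the sum over bins of (actual share - expected share) * ln(actual share / expected share). The equal-width binning, the small floor `eps` and the function names are assumptions made for this illustration; Etiq's own `psi` measure may bin the data differently.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Share of values per bin, floored at eps so the logarithm is always defined."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Walk up the bin edges; values beyond the last edge land in the last bin.
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], n_bins: int = 10
) -> float:
    """PSI of `actual` against `expected`, with equal-width bins from `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Toy hours-per-week samples: the second one has shifted entirely to 60 hours.
yesterday = [40] * 50 + [30] * 25 + [50] * 25
today = [60] * 100

psi_same = population_stability_index(yesterday, yesterday)
psi_shifted = population_stability_index(yesterday, today)
print("identical samples:", psi_same)
print("shifted sample above 0.15:", psi_shifted > 0.15)
```

Identical samples give a PSI of zero, while the shifted sample lands well above the 0.15 upper bound set in the config.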

In [43]:


# Create a dataset with the current data
#dataset_s = etiq.SimpleDatasetBuilder.from_dataframe(data_encoded, target_feature='income').build()

dataset_loader = etiq.dataset(data_encoded)

# Create a dataset with the comparison data (yesterday's dataset, set up earlier in the notebook)
yesterday_dataset_s = etiq.SimpleDatasetBuilder.from_dataframe(yesterday_dataset_df, target_feature='income').build()


# Use the already trained model from the previous step
from etiq import Model
model = Model(model_architecture=standard_model, model_fitted=model_fit)

# Create a snapshot and label it as PRODUCTION (snapshots are labelled Pre-Production by default)
from etiq import SnapshotStage
snapshot = project.snapshots.create(name="Snapshot 6", dataset=dataset_loader.initial_dataset, comparison_dataset=yesterday_dataset_s, model=model, bias_params=dataset_loader.bias_params, stage=SnapshotStage.PRODUCTION)



In [44]:
snapshot.scan_bias_metrics()



INFO:etiq.pipeline.BiasMetricsIssuePipeline0717:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0717:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0717:Completed pipeline


(  name business_rule mask
 0  all           all   [],
 Empty DataFrame
 Columns: []
 Index: [],
                                   name  \
 0   demographic_parity_below_threshold   
 1   demographic_parity_above_threshold   
 2       equal_odds_tpr_below_threshold   
 3       equal_odds_tpr_above_threshold   
 4       equal_odds_tnr_below_threshold   
 5       equal_odds_tnr_above_threshold   
 6    equal_opportunity_below_threshold   
 7    equal_opportunity_above_threshold   
 8  individual_fairness_below_threshold   
 9  individual_fairness_above_threshold   
 
                                              metric measure features segments  \
 0   <function demographic_parity at 0x7f7e4b1e8310>    None       {}       {}   
 1   <function demographic_parity at 0x7f7e4b1e8310>    None       {}       {}   
 2       <function equal_odds_tpr at 0x7f7e4b1e83a0>    None       {}       {}   
 3       <function equal_odds_tpr at 0x7f7e4b1e83a0>    None       {}       {}   
 4       <function

In [45]:
snapshot.scan_drift_metrics()

INFO:etiq.pipeline.DriftPipeline0399:Starting pipeline
INFO:etiq.pipeline.DriftPipeline0399:Calculated drift measures.
INFO:etiq.pipeline.DriftPipeline0399:Identifying drift measure issues.
INFO:etiq.pipeline.DriftPipeline0399:Completed pipeline


(  name business_rule  \
 0  all           all   
 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             mask  
 0  [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, T