# Notebook Summary 


### Quickstart

  1. Import the etiq library. For installation instructions, see our docs (https://docs.etiq.ai/).

  2. Log in to the dashboard so that results are sent to your dashboard instance (the Etiq AWS instance if you use the SaaS version). To deploy on your own cloud instance, get in touch (info@etiq.ai).

  3. Create or open a project 
  
  
### Bias Metrics & Bias Sources Scans 


  4. Load Adult dataset 
  
  5. Load your config file and create your snapshot based on an Etiq-wrapped XGBoost model
  
  6. Scan for bias issues 
  
  7. Scan for bias sources in the training dataset



# What is bias, and why does it matter?

In this context, bias means algorithmic bias: unintended discrimination that occurs as a result of an automated decision.

Legislation defines a series of protected features. For example, in the UK the Equality Act 2010 protects citizens against discrimination on the basis of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, and sexual orientation.

The unprivileged group within the protected feature (for example, people over 65 when age is the protected feature) tends to be discriminated against and as a result tends to be the one protected by legislation. The privileged group within the protected feature tends to not be discriminated against.

If you do not tackle this issue, your model is potentially unethical, may discriminate unintentionally, and is at risk from a compliance point of view. You may also be leaving customer groups underserved and thus leaving money on the table.



# SET-UP

In [1]:
import etiq


Thanks for trying out the ETIQ.ai toolkit!

Visit our getting started documentation at https://docs.etiq.ai/

Visit our Slack channel at https://etiqcore.slack.com/ for support or feedback.



In [2]:
from etiq import login as etiq_login
etiq_login("https://dashboard.etiq.ai/", "jutJ2bU1VP5bTWKRlc7ZMlMLZQ8HZCH1Jf2QvBkWqikUt9LiD5SCSfFRMHcze3hY")


(Dashboard supplied updated license information)


'Connection successful. Projects and pipelines will be displayed in the dashboard. 😀'
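
Hard-coding the dashboard token in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead is shown below. The variable name `ETIQ_TOKEN` and the helper `read_dashboard_token` are assumptions made for this example and are not part of the etiq library.

```python
import os


def read_dashboard_token(env_var: str = "ETIQ_TOKEN") -> str:
    """Return the dashboard token stored in an environment variable.

    ETIQ_TOKEN is a hypothetical variable name chosen for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before logging in.")
    return token


# Usage in the notebook (same login call as above, token no longer inline):
# etiq_login("https://dashboard.etiq.ai/", read_dashboard_token())
```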

In [3]:
# Can get/create a single named project
project = etiq.projects.open(name="Bias Scan")

# SNAPSHOT: xgboost, pre-configured model


To illustrate some of the library's features, we build a model that predicts whether an applicant makes over or under 50K using the Adult dataset from https://archive.ics.uci.edu/ml/datasets/adult.


First, we'll be encoding the categorical features found in this dataset.

Second, we'll log the dataset to Etiq.

In this case we encode before splitting into train/test/validation because we know in advance the categories people fall into for this dataset. In production we won't run into new categories that are missing from this dataset.

However, if this does not hold for your use case, do NOT encode before splitting your sample, as this can lead to data leakage.
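
If new categories can appear in production, one way to avoid leakage is to learn the category-to-integer mapping on the training split only and send unseen categories to a reserved code. The sketch below uses plain Python rather than the etiq `LabelEncoder`. The helpers `fit_mapping` and `encode`, the `-1` code for unseen values, and the toy values are illustrative choices.

```python
def fit_mapping(train_values):
    """Learn a category -> integer mapping from the training split only."""
    return {category: code for code, category in enumerate(sorted(set(train_values)))}


def encode(values, mapping, unseen_code=-1):
    """Encode values with a fitted mapping; unseen categories get `unseen_code`."""
    return [mapping.get(value, unseen_code) for value in values]


train = ["Private", "State-gov", "Private", "Self-emp-inc"]
test = ["Private", "Never-worked"]  # "Never-worked" does not occur in train

mapping = fit_mapping(train)
print(mapping)                # {'Private': 0, 'Self-emp-inc': 1, 'State-gov': 2}
print(encode(test, mapping))  # [0, -1]
```

Because the mapping never sees the test split, nothing about the held-out data influences the encoding.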

Label encoding is itself problematic because it assigns a numerical ranking to categorical variables. Best practice is to use one-hot encoding. Because the free library is limited to 15 features, we do not use one-hot encoding in this example.
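
To make the ranking problem concrete, the sketch below encodes the same column both ways in plain Python. The helper names and the toy values are illustrative. They are not the etiq transforms.

```python
def label_encode(values):
    """Map each category to an integer; this implies an order that may not exist."""
    codes = {category: code for code, category in enumerate(sorted(set(values)))}
    return [codes[value] for value in values]


def one_hot_encode(values):
    """Map each category to its own 0/1 indicator column, with no implied order."""
    categories = sorted(set(values))
    return {category: [int(value == category) for value in values] for category in categories}


marital_status = ["Married", "Divorced", "Never-married", "Married"]

print(label_encode(marital_status))
# [1, 0, 2, 1]  (reads as Divorced < Married < Never-married)
print(one_hot_encode(marital_status))
# {'Divorced': [0, 1, 0, 0], 'Married': [1, 0, 0, 1], 'Never-married': [0, 0, 1, 0]}
```

One-hot encoding adds one column per category, which is why it quickly runs into a limit on the number of features.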

Remember: this is an example only. For the majority of scans in Etiq, you log the model to Etiq once you have the sample you will train on. Usually this sample will have numeric features only, as otherwise you will not be able to use it with the training methods of most supported libraries.

In [4]:
# Loading a dataset. We're using the adult dataset
data = etiq.utils.load_sample("adultdata.csv")
data.head()


RecursionError: maximum recursion depth exceeded while calling a Python object

In [5]:
from etiq.transforms import LabelEncoder
import pandas as pd
import numpy as np 

# use a LabelEncoder to transform categorical variables
cont_vars = ['age', 'educational-num', 'fnlwgt', 'capital-gain', 'capital-loss', 'hours-per-week']
cat_vars = list(set(data.columns.values) - set(cont_vars))

label_encoders = {}
data_encoded = pd.DataFrame()
for i in cat_vars:
    label = LabelEncoder()
    data_encoded[i] = label.fit_transform(data[i])
    label_encoders[i] = label

data_encoded.set_index(data.index, inplace=True)
data_encoded = pd.concat([data.loc[:, cont_vars], data_encoded], axis=1).copy()


## Loading the Config File and Logging the Snapshot to Etiq 

This can happen at any point in the pipeline and in a variety of ways.

In [6]:
from etiq.model import DefaultXGBoostClassifier

with etiq.etiq_config("./config_bias_scans.json"):
    # Load your dataset
    dataset = etiq.BiasDatasetBuilder.dataset(data_encoded)

    # Log your already trained model
    model = DefaultXGBoostClassifier()

    # Create a snapshot
    snapshot = project.snapshots.create(
        name="Test Bias",
        dataset=dataset,
        model=model,
        bias_params=etiq.biasparams.BiasParams(
            protected='gender',
            privileged=1,
            unprivileged=0,
            positive_outcome_label=1,
            negative_outcome_label=0,
        ),
    )

    # Run the bias metrics scan
    (segments, issues, issue_summary) = snapshot.scan_bias_metrics()
    

INFO:etiq.pipeline.BiasMetricsIssuePipeline0057:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0057:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0057:Completed pipeline


## Bias Metrics Scan

Some of the metrics commonly used in the algorithmic fairness literature that the Etiq library provides are:

- equal_opportunity (the difference in true positive rate between a privileged demographic group and an unprivileged demographic group)

- demographic_parity (the difference in the number of positive labels out of the total between a privileged demographic group and an unprivileged demographic group)

- equal_odds_tpr & equal_odds_tnr (unlike equal_opportunity, this criterion looks at both the difference in true positive rate and the difference in true negative rate between the privileged and unprivileged groups, with the aim of keeping both differences minimal)

- individual_fairness (measures whether individuals with similar features observe the same model responses)
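
The rate-based metrics in this list can be sketched in a few lines of plain Python. This follows the definitions given here and is not Etiq's implementation. It assumes that "positive labels" means positive predictions and that differences are reported as absolute values. `individual_fairness` is not covered.

```python
def rate(numerator, denominator):
    """Safe ratio that returns 0.0 for an empty group."""
    return numerator / denominator if denominator else 0.0


def group_rates(y_true, y_pred, group, value):
    """Positive-prediction rate, TPR and TNR for the rows where group == value."""
    rows = [(t, p) for t, p, g in zip(y_true, y_pred, group) if g == value]
    tp = sum(1 for t, p in rows if t == 1 and p == 1)
    tn = sum(1 for t, p in rows if t == 0 and p == 0)
    positives = sum(1 for t, _ in rows if t == 1)
    negatives = len(rows) - positives
    predicted_positive = sum(1 for _, p in rows if p == 1)
    return {
        "positive_rate": rate(predicted_positive, len(rows)),
        "tpr": rate(tp, positives),
        "tnr": rate(tn, negatives),
    }


def bias_metrics(y_true, y_pred, group, privileged=1, unprivileged=0):
    """Absolute differences between the privileged and unprivileged groups."""
    priv = group_rates(y_true, y_pred, group, privileged)
    unpriv = group_rates(y_true, y_pred, group, unprivileged)
    tpr_gap = abs(priv["tpr"] - unpriv["tpr"])
    return {
        "demographic_parity": abs(priv["positive_rate"] - unpriv["positive_rate"]),
        "equal_opportunity": tpr_gap,
        "equal_odds_tpr": tpr_gap,
        "equal_odds_tnr": abs(priv["tnr"] - unpriv["tnr"]),
    }


# Toy data: the first four rows are the privileged group (1), the rest unprivileged (0)
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]

metrics = bias_metrics(y_true, y_pred, group)
print(metrics)
# {'demographic_parity': 0.5, 'equal_opportunity': 0.5, 'equal_odds_tpr': 0.5, 'equal_odds_tnr': 0.5}
```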

Our Bias Metrics scan uses the metrics above with certain thresholds to see if the model meets that benchmark or not. 

The thresholds are set by the user, but most metrics should ideally be as close to 0 as possible, meaning that the model should not behave differently (and with detrimental outcomes) for the protected groups.
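
The issue names in the scan output end in `_below_threshold` and `_above_threshold`, and thresholds are reported as ranges such as `[0.0, 0.2]`. A minimal sketch of such a check follows, assuming a metric passes when it lies inside the range with both bounds included. The function name and the example values are illustrative, not Etiq's implementation.

```python
def check_threshold(name, value, lower=0.0, upper=0.2):
    """Return the issue names raised for a metric value outside [lower, upper]."""
    issues = []
    if value < lower:
        issues.append(f"{name}_below_threshold")
    if value > upper:
        issues.append(f"{name}_above_threshold")
    return issues


print(check_threshold("equal_opportunity", 0.05))   # []
print(check_threshold("demographic_parity", 0.31))  # ['demographic_parity_above_threshold']
```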

The consensus in the literature (and our view) is that algorithmic bias can be mitigated but not removed entirely.


In [7]:
issue_summary

Unnamed: 0,name,metric,measure,features,segments,total_issues_tested,issues_found,threshold
0,demographic_parity_below_threshold,<compiled_function demographic_parity at 0x7fb...,,{},{},1,0,"[0.0, 0.2]"
1,demographic_parity_above_threshold,<compiled_function demographic_parity at 0x7fb...,,{},{},1,0,"[0.0, 0.2]"
2,equal_odds_tpr_below_threshold,<compiled_function equal_odds_tpr at 0x7fb9163...,,{},{},1,0,"[0.0, 0.2]"
3,equal_odds_tpr_above_threshold,<compiled_function equal_odds_tpr at 0x7fb9163...,,{},{},1,0,"[0.0, 0.2]"
4,equal_odds_tnr_below_threshold,<compiled_function equal_odds_tnr at 0x7fb9163...,,{},{},1,0,"[0.0, 0.2]"
5,equal_odds_tnr_above_threshold,<compiled_function equal_odds_tnr at 0x7fb9163...,,{},{},1,0,"[0.0, 0.2]"
6,equal_opportunity_below_threshold,<compiled_function equal_opportunity at 0x7fb9...,,{},{},1,0,"[0.0, 0.2]"
7,equal_opportunity_above_threshold,<compiled_function equal_opportunity at 0x7fb9...,,{},{},1,0,"[0.0, 0.2]"
8,individual_fairness_below_threshold,<compiled_function individual_fairness at 0x7f...,,{},{},1,0,"[0.0, 0.2]"
9,individual_fairness_above_threshold,<compiled_function individual_fairness at 0x7f...,,{},{},1,0,"[0.0, 0.2]"


## Bias Sources 



Our Bias Sources scan identifies potential sources of bias based on a framework that includes: 

- proxies - features that act as stand-ins for the protected feature
- sample size disparity - difference in sample sizes and size of positive/negative labels between protected demographic and the majority demographic group
- segment size - whether some customer profiles are poorly represented in your sample
- limited features/correlation issue - features are less reliable for a certain demographic group, which is often linked with sampling, but more fundamentally it could be that some groups' behaviour is less well encoded by the available features
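
As an illustration of the sample size disparity item, the sketch below counts records and the share of positive labels per demographic group. It is written from the description in this list, not taken from the Etiq implementation, and the toy records are made up.

```python
from collections import Counter


def group_summary(groups, labels, positive_label=1):
    """Return {group: (record count, share of positive labels)}."""
    totals = Counter(groups)
    positives = Counter(g for g, y in zip(groups, labels) if y == positive_label)
    return {g: (totals[g], positives[g] / totals[g]) for g in sorted(totals)}


gender = [1, 1, 1, 1, 1, 1, 0, 0]
income = [1, 1, 1, 0, 0, 0, 0, 0]

summary = group_summary(gender, income)
print(summary)  # {0: (2, 0.0), 1: (6, 0.5)}
```

Here group 0 contributes a quarter of the records and none of the positive labels, which is the kind of disparity this check is meant to surface.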


It can be useful to look at these metrics globally to uncover issues across your sample. However, many issues will only be visible for specific groups or specific records. The Bias Sources scan aims to identify which groups have the issues above.


The Bias Sources scan is run on the training dataset by default, as this is where your model learns the potentially harmful, unfairly discriminatory pattern.

You can run the bias sources scan in two ways:

1) If you don't set anything in the config, the segments will be fuzzy rather than based on business rules.

2) If you set the `auto` option in the config (as in the config we are using here), the segments will be based on business rules.

If you use the auto option, you will need to specify the categorical and continuous features. You can do this either from the config as in this case or from the notebook (see last cell). 


We provide multiple correlation measures, chosen according to the types of the features involved: Pearson, Cramér's V, rank-biserial and point-biserial. To make full use of them, specify in the config or the snapshot which features are of which type. You can customize the measures in the config, but the default and recommended settings are below:

- "continuous_continuous_measure": "pearsons"
- "categorical_categorical_measure": "cramersv"
- "categorical_continuous_measure": "rankbiserial"
- "binary_continuous_measure": "pointbiserial"
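
For reference, two of these measures can be written in plain Python from their textbook definitions. This is a sketch, not the code Etiq runs. The point-biserial coefficient is the Pearson correlation computed with a 0/1 variable, and Cramér's V is shown without bias correction. Rank-biserial is omitted.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation; with a 0/1 variable in x this is the point-biserial coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one feature is constant
    return cov / math.sqrt(var_x * var_y)


def cramers_v(x, y):
    """Cramér's V (no bias correction) between two categorical variables."""
    n = len(x)
    observed = Counter(zip(x, y))
    row_totals, col_totals = Counter(x), Counter(y)
    chi2 = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else float("nan")


print(round(pearson([0, 0, 1, 1], [1, 2, 3, 4]), 4))         # 0.8944
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # 1.0
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # 0.0
```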

In [8]:
(segments, issues, issue_summary) = snapshot.scan_bias_sources()

INFO:etiq.pipeline.DataPipeline0365:Starting pipeline
INFO:etiq.pipeline.DataPipeline0365:Computed metrics for the initial dataset
INFO:etiq.pipeline.DataPipeline0365:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0932:Starting pipeline
INFO:etiq.pipeline.DebiasPipeline0932:Start Phase IdentifyPipeline0490
INFO:etiq.pipeline.IdentifyPipeline0490:Using parent model
INFO:etiq.pipeline.IdentifyPipeline0490:Starting pipeline


  pk = 1.0*pk / np.sum(pk, axis=axis, keepdims=True)
  c /= stddev[:, None]
  c /= stddev[None, :]


INFO:etiq.pipeline.IdentifyPipeline0490:Completed pipeline
INFO:etiq.pipeline.DebiasPipeline0932:Completed Phase IdentifyPipeline0490
INFO:etiq.pipeline.DebiasPipeline0932:Computed metrics for the initial dataset
INFO:etiq.pipeline.DebiasPipeline0932:Completed pipeline




In [9]:
issues

Unnamed: 0,name,feature,segment,measure,measure_value,metric,metric_value,threshold
0,low_volume_group,,4.0,,,,,"(1000, inf)"
1,low_volume_group,,7.0,,,,,"(1000, inf)"
2,low_volume_group,,9.0,,,,,"(1000, inf)"
3,low_volume_group,,10.0,,,,,"(1000, inf)"
4,low_volume_group,,12.0,,,,,"(1000, inf)"
...,...,...,...,...,...,...,...,...
90,correlation_issue,occupation,14.0,<compiled_function cramersv at 0x7fb915eb3240>,0.213495,,,"(0.0, 0.2)"
91,correlation_issue,native-country,14.0,<compiled_function cramersv at 0x7fb915eb3240>,,,,"(0.0, 0.2)"
92,correlation_issue,education,14.0,<compiled_function cramersv at 0x7fb915eb3240>,0.248230,,,"(0.0, 0.2)"
93,limited_features_issue,,16.0,,,<compiled_function equal_opportunity at 0x7fb9...,0.226986,"(0.0, 0.2)"


In [10]:
issue_summary

Unnamed: 0,name,metric,measure,features,segments,total_issues_tested,issues_found,threshold
0,missing_sample,,,{},"{10, 7}",20,2,"(0.0, 0.0)"
1,low_unpriv_sample,,,{},{},18,0,"(0.0, 0.8)"
2,low_priv_sample,,,{},{},18,0,"(0.0, 0.8)"
3,skewed_priv_sample,,,{},{},13,0,"(0.0, 0.2)"
4,skewed_unpriv_sample,,,{},"{0, 1, 5, 6, 9}",18,5,"(0.0, 0.2)"
5,low_volume_group,,,{},"{4, 7, 9, 10, 12, 14, 19}",20,7,"(1000, inf)"
6,limited_features_issue,<compiled_function equal_opportunity at 0x7fb9...,,{},"{16.0, 19.0}",20,2,"(0.0, 0.2)"
7,proxy_issue,,<compiled_function pointbiserial at 0x7fb915eb...,{educational-num},{7},120,1,"(0.0, 1.0)"
8,proxy_issue,,<compiled_function cramersv at 0x7fb915eb3240>,"{marital-status, race, relationship, occupatio...","{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...",140,29,"(0.0, 1.0)"
9,correlation_issue,,<compiled_function pointbiserial at 0x7fb915eb...,"{capital-gain, capital-loss, age, fnlwgt, hour...","{0, 4, 7, 10, 12, 14}",120,25,"(0.0, 1.0)"


In [11]:
pd.set_option('display.max_colwidth', None)

In [12]:
segments

Unnamed: 0,name,business_rule,mask
0,0,kmeans segment 0,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
1,1,kmeans segment 1,"[False, False, True, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, True, False, False, False, False, False, False, True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, True, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, ...]"
2,2,kmeans segment 2,"[False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, ...]"
3,3,kmeans segment 3,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
4,4,kmeans segment 4,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
5,5,kmeans segment 5,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, True, False, True, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, True, False, True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, True, False, False, False, True, False, False, False, False, False, True, True, True, False, False, False, ...]"
6,6,kmeans segment 6,"[True, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, ...]"
7,7,kmeans segment 7,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
8,8,kmeans segment 8,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, True, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
9,9,kmeans segment 9,"[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, ...]"
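
Assuming each `mask` holds one boolean per record, the size of a segment is the number of `True` values. The sketch below counts segment sizes and flags segments below a minimum volume. The default of 1000 mirrors the `low_volume_group` threshold shown in the issues table, while the function name, the strict `<` comparison and the toy masks are illustrative.

```python
def low_volume_segments(masks, min_size=1000):
    """Return {segment name: size} for segments with fewer than `min_size` records."""
    sizes = {name: sum(mask) for name, mask in masks.items()}
    return {name: size for name, size in sizes.items() if size < min_size}


masks = {
    "kmeans segment 0": [True, False, True, True],
    "kmeans segment 1": [False, True, False, False],
}
print(low_volume_segments(masks, min_size=2))  # {'kmeans segment 1': 1}
```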


You can also specify the categorical and continuous features directly in the notebook, as in the example below:

In [13]:
from etiq.model import DefaultXGBoostClassifier

with etiq.etiq_config("./config_bias_scans.json"):
    # Bias parameters shared by the dataset loader and the snapshot
    bias_params = etiq.biasparams.BiasParams(protected='gender', privileged=1, unprivileged=0,
                                             positive_outcome_label=1, negative_outcome_label=0)

    # Load your dataset, declaring the categorical and continuous features
    dataset = etiq.BiasDatasetBuilder.dataset(data_encoded)
    dl = etiq.dataset_loader.DatasetLoader(
        data=data_encoded,
        label='income',
        bias_params=bias_params,
        train_valid_test_splits=[0.8, 0.1, 0.1],
        cat_col=cat_vars,
        cont_col=cont_vars,
        names_col=data_encoded.columns.values,
    )

    # Log your already trained model
    model = DefaultXGBoostClassifier()

    # Create a snapshot
    snapshot = project.snapshots.create(
        name="Snapshot 2",
        dataset=dl.initial_dataset,
        model=model,
        bias_params=bias_params,
    )

    # Run the bias metrics scan
    (segments, issues, issue_summary) = snapshot.scan_bias_metrics()
    

INFO:etiq.pipeline.BiasMetricsIssuePipeline0036:Starting pipeline
INFO:etiq.pipeline.BiasMetricsIssuePipeline0036:Computed bias metrics for the dataset
INFO:etiq.pipeline.BiasMetricsIssuePipeline0036:Completed pipeline
