Before beginning Task 3, run the following cell to import all necessary packages. If you need additional packages, add the import statement(s) to the cell below and re-run it before running any code that uses them.


In [2]:
# Load all necessary packages
import numpy as np
import sklearn as skl
import six
import tensorflow as tf

# dataset
from aif360.datasets import AdultDataset

# metrics
from fklearn.metric_library import UnifiedMetricLibrary, classifier_quality_score

# models
from fklearn.scikit_learn_wrapper import LogisticRegression, KNeighborsClassifier, RandomForestClassifier, SVC
from aif360.algorithms.inprocessing import AdversarialDebiasing

# pre/post-processing algorithms
from aif360.algorithms.preprocessing import DisparateImpactRemover, Reweighing
from aif360.algorithms.postprocessing import CalibratedEqOddsPostprocessing, RejectOptionClassification

from fklearn.fair_selection_aif import ModelSearch, DEFAULT_ADB_PARAMS

# Tutorial 3: fairkit-learn

Lastly, we show you how to train and evaluate models using fairkit-learn. You will use the knowledge from this tutorial to complete Task 3, so please read thoroughly and execute the code cells in order.

## Step 1: Import the dataset

First we need to import the dataset we will use for training and testing our model.

Below, we provide code that imports the Adult census dataset. **Note: a warning may pop up when you run this cell. As long as you don't see any errors in the code, it is fine to continue.**


In [3]:
data_orig = AdultDataset()



## Step 2: Set protected attributes

To use the grid search functionality provided by fairkit-learn, we again need to specify the privileged and unprivileged groups in terms of the protected attributes.

Below we provide code that stores the protected attributes (*race* is 0 for "Non-white", *sex* is 0 for "Female").

In [4]:
unprivileged = [{'race': 0, 'sex': 0}]
privileged = [{'race': 1, 'sex': 1}]
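To make the group definitions above more concrete, here is a minimal sketch of how a list of attribute dictionaries can select rows. It assumes the convention we understand AIF360 to follow: conditions inside one dictionary are combined with AND, and separate dictionaries in the list are combined with OR. The helper `group_mask` and the toy array are hypothetical and are not part of fairkit-learn or AIF360.

```python
import numpy as np

def group_mask(features, feature_names, groups):
    """Boolean mask of rows that match at least one dictionary in `groups`."""
    mask = np.zeros(features.shape[0], dtype=bool)
    for group in groups:
        # A row matches a dictionary only if every attribute has the given value.
        in_group = np.ones(features.shape[0], dtype=bool)
        for name, value in group.items():
            in_group &= features[:, feature_names.index(name)] == value
        mask |= in_group
    return mask

# Toy data: one row per (race, sex) combination.
feature_names = ['race', 'sex']
toy = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

unprivileged_mask = group_mask(toy, feature_names, [{'race': 0, 'sex': 0}])
privileged_mask = group_mask(toy, feature_names, [{'race': 1, 'sex': 1}])
print(unprivileged_mask)
print(privileged_mask)
```

Under this assumption, rows where only one of the two attributes is 0 fall into neither group, which is worth keeping in mind when you interpret the fairness metrics.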

## Step 3: Specify parameters for grid search

Now we need to specify the various parameters required for the grid search provided by fairkit-learn. Each search parameter is a dictionary of options to include in the search. For each search parameter, you can input one or multiple options to consider. 

Below we provide code that sets parameters for a simple grid search. It covers different hyperparameter values for the Logistic Regression model and includes the Random Forest and Adversarial Debiasing models with their default settings, each with and without the specified pre-/post-processing algorithms. We specify all performance and fairness metrics for the search. The classifier quality score is the exception: because of the way it is calculated, it cannot be added to the grid search and will be calculated later.

In [5]:
# models to include in the search (three in this example)
models = {'LogisticRegression': LogisticRegression, 'RandomForestClassifier': RandomForestClassifier,
          'AdversarialDebiasing': AdversarialDebiasing}

# here we add all the metrics we want to evaluate on (performance and fairness)
metrics = {'UnifiedMetricLibrary': [UnifiedMetricLibrary,
                                    'accuracy_score',
                                    'average_odds_difference',
                                    'statistical_parity_difference',
                                    'equal_opportunity_difference',
                                    'disparate_impact'
                                   ]
          }

# Hyperparameters may be specified either as a dictionary mapping strings to lists, or as an empty
# dictionary to use the defaults set by sklearn (or AIF360). The keys are the names of the hyperparameters,
# and the values are lists of possible values to form a grid search over (example shown with LogisticRegression)

# For the AdversarialDebiasing classifier, you would specify hyperparameters using the following dictionary
# entry:
# 'AdversarialDebiasing' : DEFAULT_ADB_PARAMS(unprivileged=unprivileged, privileged=privileged)

hyperparameters = {'LogisticRegression': {'penalty': ['l1', 'l2'], 'C': [0.1, 0.5, 1]},
                   'RandomForestClassifier': {},
                   'AdversarialDebiasing': DEFAULT_ADB_PARAMS(unprivileged=unprivileged, privileged=privileged)}

# this parameter is needed for the search and does not need to be modified
thresholds = [i * 10.0/100 for i in range(5)]

# Specify pre/post-processors as a list of initialized AIF360 pre/post-processing instances; 
# you can also run without any pre/post-processing algorithms (empty list)
processor_args = {'unprivileged_groups': unprivileged, 'privileged_groups': privileged}

# Options: DisparateImpactRemover(), Reweighing(unprivileged_groups=unprivileged,privileged_groups=privileged), or both
preprocessors=[DisparateImpactRemover(), Reweighing(**processor_args)]
# Options: CalibratedEqOddsPostprocessing(unprivileged_groups=unprivileged,privileged_groups=privileged), RejectOptionClassification(unprivileged_groups=unprivileged,privileged_groups=privileged), or both
postprocessors=[CalibratedEqOddsPostprocessing(**processor_args), RejectOptionClassification(**processor_args)]
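To get a feel for how quickly a search space grows, the sketch below expands a hyperparameter dictionary into its individual combinations and evaluates the threshold expression used above. The helper `expand_grid` is hypothetical and only illustrates the general idea of a grid; how fairkit-learn combines these options internally with models, thresholds, and pre/post-processors is not shown here.

```python
from itertools import product

def expand_grid(param_grid):
    """List every combination of values in a {name: [values]} dictionary."""
    names = sorted(param_grid)
    value_lists = [param_grid[name] for name in names]
    # product() of no lists yields one empty combination, i.e. "use the defaults".
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

lr_grid = {'penalty': ['l1', 'l2'], 'C': [0.1, 0.5, 1]}
lr_configs = expand_grid(lr_grid)
print(len(lr_configs))    # 6

# Same expression as the `thresholds` line above.
demo_thresholds = [i * 10.0 / 100 for i in range(5)]
print(demo_thresholds)    # [0.0, 0.1, 0.2, 0.3, 0.4]
```

Each additional hyperparameter value multiplies the number of Logistic Regression configurations, so it is worth keeping the lists short while you experiment.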


## Step 4: Run the grid search

Now that we've set all the parameters necessary for the grid search, we're ready to run it. The output of the grid search is saved to a .csv file.

Below we provide code that creates and uses the `ModelSearch` object to run a grid search over the parameters we specified and saves the output to a .csv file in the specified directory. **The search takes a while to complete. Wait until the search completes before attempting to execute more cells.**

**Note: warnings may appear during the search. As long as you don’t see any code errors, it is fine to continue.**


In [None]:
Search = ModelSearch(models, metrics, hyperparameters, thresholds)
Search.grid_search(data_orig, privileged=privileged, unprivileged=unprivileged, preprocessors=preprocessors, postprocessors=postprocessors)

Search.to_csv("fklearn/interface/static/data/test-file.csv")
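Once the search has finished, you may want to confirm that the results file was written. The sketch below uses only the Python standard library to read the header and the first few rows. The helper `preview_csv` is hypothetical, and since the column layout of the file is not described in this tutorial, the code simply prints whatever header it finds.

```python
import csv
import os

def preview_csv(path, n_rows=5):
    """Return the header row and the first `n_rows` data rows of a CSV file."""
    with open(path, newline='') as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows

# Path taken from the to_csv call above; skipped if the file does not exist yet.
results_path = "fklearn/interface/static/data/test-file.csv"
if os.path.exists(results_path):
    header, rows = preview_csv(results_path)
    print(header)
    print(len(rows), "rows previewed")
```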



















epoch 49; iter: 0; batch classifier loss: 0.428525; batch adversarial loss: 0.376637
epoch 49; iter: 200; batch classifier loss: 0.403683; batch adversarial loss: 0.389120
epoch 0; iter: 0; batch classifier loss: 34.946327; batch adversarial loss: 1.059033
epoch 0; iter: 200; batch classifier loss: 7.362378; batch adversarial loss: 0.867830
epoch 1; iter: 0; batch classifier loss: 1.984664; batch adversarial loss: 0.686881
epoch 1; iter: 200; batch classifier loss: 4.448523; batch adversarial loss: 0.683302
epoch 2; iter: 0; batch classifier loss: 6.946095; batch

epoch 44; iter: 200; batch classifier loss: 0.372456; batch adversarial loss: 0.367216
epoch 45; iter: 0; batch classifier loss: 0.347583; batch adversarial loss: 0.461049
epoch 45; iter: 200; batch classifier loss: 0.367362; batch adversarial loss: 0.332134
epoch 46; iter: 0; batch classifier loss: 0.382968; batch adversarial loss: 0.481443
epoch 46; iter: 200; batch classifier loss: 0.349227; batch adversarial loss: 0.436819
epoch 47; iter: 0; batch classifier loss: 0.362101; batch adversarial loss: 0.349596
epoch 47; iter: 200; batch classifier loss: 0.393379; batch adversarial loss: 0.505211
epoch 48; iter: 0; batch classifier loss: 0.344760; batch adversarial loss: 0.385158
epoch 48; iter: 200; batch classifier loss: 0.294458; batch adversarial loss: 0.358625
epoch 49; iter: 0; batch classifier loss: 0.360043; batch adversarial loss: 0.362116
epoch 49; iter: 200; batch classifier loss: 0.358058; batch adversarial loss: 0.427663
epoch 0; iter: 0; batch classifier loss: 26.145233; b

## Step 5: Render visualization of search results

Along with the ability to run a grid search, fairkit-learn also provides functionality to visualize the results of the grid search. Fairkit-learn uses Bokeh to render a visualization within the notebook, which you can use when completing the next task to explore trained models' performance and fairness.

The visualization includes a graph that plots the Pareto-optimal search results. Each data point in the graph is a model with its own settings (e.g., hyper-parameters, pre-/post-processing). Each model class has its own color, making it easier to see which models are shown. To get more information on a model's settings, hover over its data point; a tooltip will pop up with the settings.

Within the visualization, you can control which metrics and models are included. The drop-down menus let you specify the x and y axes of the graph. The checklist below the list of models lets you select which metrics are considered in the graph.

To view the *Pareto frontier* for any two metrics (e.g., accuracy and disparate impact), select those two metrics from the drop-down menus and check **only** those boxes in the checklist.
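
To make the idea of a Pareto frontier concrete, here is a minimal sketch, independent of fairkit-learn's own implementation, that filters a handful of made-up (accuracy, fairness gap) pairs down to the Pareto-optimal ones. The function name `pareto_front`, the scores, and the convention that higher accuracy and a lower gap are better are all assumptions chosen for illustration.

```python
import numpy as np

def pareto_front(scores, maximize):
    """Return indices of rows in `scores` that no other row dominates.

    scores   : array-like of shape (n_models, n_metrics)
    maximize : one bool per metric; True if larger is better, False if smaller is better
    """
    scores = np.asarray(scores, dtype=float)
    # Flip the sign of metrics to minimize so that "larger is better" holds everywhere
    signed = np.where(np.asarray(maximize), scores, -scores)
    keep = []
    for i, row in enumerate(signed):
        # Another row dominates this one if it is at least as good on every metric
        # and strictly better on at least one
        at_least_as_good = np.all(signed >= row, axis=1)
        strictly_better = np.any(signed > row, axis=1)
        if not np.any(at_least_as_good & strictly_better):
            keep.append(i)
    return keep

# Hypothetical results: (accuracy, absolute fairness gap) for four models
results = [
    (0.85, 0.20),  # most accurate, least fair
    (0.80, 0.05),  # a trade-off between the two
    (0.78, 0.10),  # worse than the second model on both metrics
    (0.70, 0.01),  # fairest, least accurate
]
front = pareto_front(results, maximize=[True, False])
print(front)
```

Only the models that survive this filter would appear on the frontier for the two selected metrics; the third model is dropped because the second one beats it on both.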

Below we provide code to load Bokeh and plot the results from the search in the interactive plot.

In [None]:
# Import packages for visualization
from bokeh.io import output_notebook
from bokeh.application.handlers import FunctionHandler
from bokeh.application import Application

# load Bokeh
output_notebook()

In [None]:
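from fklearn.interface.plot import *
# Explicit imports for the Bokeh names used below, in case the star import does not re-export them
from bokeh.io import show
from bokeh.layouts import layout
from bokeh.models import Button, Div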

# Define function that takes in a document and attaches the bokeh server to it
def modify_doc(doc):
    
    # Load custom styles (for notebook only)
    custom_css = Div(text="<link rel='stylesheet' type='text/css' href='fklearn/interface/static/css/styles-notebook.css'>")
    add_btn = Button(label="Add Plot", button_type="success")
    remove_btn = Button(label="Remove Plot", button_type="danger")

    # Construct our viewport
    l = layout([
        [custom_css],
        create_plot("fklearn/interface/static/data/test-file.csv")
    ], sizing_mode="fixed", css_classes=["layout-container"])

    doc.add_root(l)
    
# Set up the application
handler = FunctionHandler(modify_doc)
app = Application(handler)

# Render visualization in the notebook
show(app)

## Step 6: Export visualization (optional)

The visualization can be viewed within this notebook and re-rendered as many times as needed, but can also be exported for future viewing and comparison to other plots. You can export the visualization and relevant information by clicking the ``Export Plot`` button in the visualization. 

This will create two files, plot.png and plot.json. The file plot.png is an image of the plot; plot.json is a JSON file with information about the plot, such as which models are shown and which metrics are selected.

Each time the export button is clicked, any existing plot.png and plot.json are overwritten. If you wish to save plots for comparison, make sure you rename each file after export.
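
One way to keep earlier exports around is to copy them to timestamped names right after each export. The sketch below uses only the Python standard library. The helper name `archive_export` and the `export_dir` default are hypothetical: where plot.png and plot.json actually land depends on your browser and notebook setup, so adjust the directory as needed.

```python
import shutil
import time
from pathlib import Path

def archive_export(label, export_dir=".", names=("plot.png", "plot.json")):
    """Copy the most recent export to files prefixed with `label` and a timestamp.

    `export_dir` is wherever the exported files were saved (an assumption; adjust as needed).
    Returns the list of newly created paths.
    """
    export_dir = Path(export_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    copies = []
    for name in names:
        src = export_dir / name
        if not src.exists():
            continue  # skip files that were not exported
        dst = export_dir / f"{label}-{stamp}-{name}"
        shutil.copy2(src, dst)  # copy rather than move, so the originals stay in place
        copies.append(dst)
    return copies
```

For example, calling `archive_export("accuracy-vs-disparate-impact")` after an export would leave the originals untouched and create labeled copies next to them.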

## Step 7: Evaluate overall model quality

Now that we've explored the various model configurations and their performance and fairness, we are ready to select the model(s) that we want to evaluate for overall quality.

To do so, you will need to create and train the model(s) (with proper hyperparameters, pre-processing, and post-processing as specified in the search output) you selected and then evaluate overall model quality.

Below we provide (commented) code that shows how to initialize the various models and algorithms you have access to. You can use the code as provided or modify it as you see fit when completing the task.

In [None]:
# split dataset for evaluation
# data_orig_train, data_orig_test = data_orig.split([0.7], shuffle=False)

# model is populated with default values; modifying parameters is allowed but optional
# model = LogisticRegression(penalty='l2', dual=False,tol=0.0001,C=1.0,
#                       fit_intercept=True,intercept_scaling=1,class_weight=None,
#                       random_state=None,solver='liblinear',max_iter=100, 
#                       multi_class='warn',verbose=0,warm_start=False,
#                       n_jobs=None)

#model = KNeighborsClassifier(n_neighbors=5,weights='uniform',algorithm='auto',
#                          leaf_size=30,p=2,metric='minkowski',metric_params=None,
#                          n_jobs=None)

#model = RandomForestClassifier(n_estimators='warn',criterion='gini',max_depth=None,
#                            min_samples_leaf=1,min_weight_fraction_leaf=0.0,
#                            min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, 
#                             random_state=None, verbose=0, warm_start=False, class_weight=None)

#model = SVC(C=1.0, kernel='rbf', degree=3, gamma='auto_deprecated', coef0=0.0, shrinking=True, 
#          probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, 
#          max_iter=-1, decision_function_shape='ovr', random_state=None)

# If this is not your first time creating the Adversarial Debiasing model, uncomment the two lines
# below and run them before the code that initializes the TensorFlow session and model, to avoid errors:
# sess.close()
# tf.reset_default_graph()

#sess = tf.Session()
#model = AdversarialDebiasing(privileged_groups=privileged,
#                          unprivileged_groups=unprivileged,
#                          scope_name='debiased_classifier',
#                          debias=True,
#                          sess=sess)


# you can modify repair level (optional)
#pre_alg = DisparateImpactRemover(repair_level=1.0)
# training data
#pre_train_data = pre_alg.fit_transform(data_orig_train)
# test data
#pre_test_data = pre_alg.fit_transform(data_orig_test)



# Reweighing

#pre_alg = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
# train
#pre_alg = pre_alg.fit(data_orig_train)
#pre_train_data = pre_alg.transform(data_orig_train)
# test
#pre_alg = pre_alg.fit(data_orig_test)
#pre_test_data = pre_alg.transform(data_orig_test)


# train model with pre-processed data
#model.fit(pre_train_data)

# train model with original data 
# model.fit(data_orig_train)


# process trained model
# Calibrated Equal Odds
#post_alg = CalibratedEqOddsPostprocessing(unprivileged_groups=unprivileged,
#                                         privileged_groups=privileged,
#                                         cost_constraint='weighted',
#                                         seed=None)



# Reject Option Classification 
# With this algorithm, you can specify "metric_name" with the metric you want to optimize for.
# The options are "Statistical parity difference", "Average odds difference", or "Equal opportunity difference"

# post_alg = RejectOptionClassification(unprivileged_groups=unprivileged,
#                                      privileged_groups=privileged,
#                                      low_class_thresh=0.01,
#                                      high_class_thresh=0.99,num_class_thresh=100, 
#                                      num_ROC_margin=50,metric_name='Statistical parity difference',
#                                      metric_ub=0.05, metric_lb=-0.05)



# test with pre-processed test data
#predictions = model.predict(pre_test_data)

# test with original test data
# predictions = model.predict(data_orig_test)


# fit with post-processing model using pre-processed data
#post_model = post_alg.fit(pre_test_data, predictions)

# fit with post-processing model using original data
# post_model = post_alg.fit(data_orig_test, predictions)



# update predictions using post-processed model
#predictions = post_model.predict(pre_test_data)

# evaluate overall model quality on post-processed model
#quality_score = classifier_quality_score(post_model, predictions, 
#                                             unprivileged_groups=unprivileged, 
#                                             privileged_groups=privileged)

# evaluate overall model quality on model without post-processing
#quality_score = classifier_quality_score(model, predictions, 
#                                             unprivileged_groups=unprivileged, 
#                                             privileged_groups=privileged)

#print("Overall quality = " + str(quality_score))

# Task 3: Model evaluation with fairkit-learn

Your turn again! Use what you learned in the above tutorial to train and evaluate models for performance, fairness, and overall quality. You will use functionality provided by fairkit-learn to meet the following goals:

1. **Describe a model you believe will perform the best (e.g., have the highest accuracy score).** 

2. **Describe a model you believe will be the most fair, regardless of performance (e.g., minimizes the value of difference fairness metrics or maximizes disparate impact).** 

3. **Describe a model you believe will best balance both performance and fairness (e.g., have the highest classifier quality score).** 

Make sure you include any modifications to model hyper-parameters and any pre-/post-processing algorithms used. **As a reminder, there is no "absolute best" model for each of the above goals. You are expected to explore the space of model configurations available to find a model that best meets the above goals.**

**Keep in mind, training machine learning models is often a time-intensive endeavor.** One way to reduce the time needed for this task is to shrink the search space (e.g., the number of models included in a single search). You can also save time during evaluation by reducing the number of times you have to, for example, retrain a given model before evaluating it. You can do this by putting the code that initializes and trains your model(s) in its own separate cell and only executing that cell when needed.
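
As a rough illustration of avoiding repeated training, the sketch below caches trained models in a dictionary so that re-running an evaluation cell does not retrain anything. The names `trained_models`, `get_or_train`, and `slow_train` are hypothetical helpers, not part of fairkit-learn; in the notebook, the training function would wrap your own model construction and `model.fit(...)` call.

```python
trained_models = {}  # maps a configuration name to an already-trained model

def get_or_train(name, train_fn):
    """Return the cached model for `name`, calling `train_fn()` only on first use."""
    if name not in trained_models:
        trained_models[name] = train_fn()
    return trained_models[name]

# Stand-in for an expensive training routine (hypothetical); counts how often it runs
calls = {"n": 0}

def slow_train():
    calls["n"] += 1
    return {"weights": [0.1, 0.2]}  # placeholder for a fitted model

first = get_or_train("logreg-default", slow_train)
second = get_or_train("logreg-default", slow_train)  # served from the cache
print(calls["n"])
```

Keeping the dictionary in its own cell means it survives while you edit and re-run the evaluation code; re-running the cell that defines `trained_models` clears the cache.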

## Submitting your response

Once you feel you've met the above goals, go to the Evaluating ML Models Exercise Response Form to enter your responses under the section labeled 'Task 3'.

If you accidentally closed your response form, check your email for the link to re-open it.

In [None]:
# TODO : Use this cell to write code for completing task 3



Once you've completed this final task, make sure you're satisfied with your responses, complete the exercise feedback portion, and submit the form.