# Evaluating Machine Learning Models

Today, we will use machine learning tools to train models while paying close attention to model fairness.

First, we will use **scikit-learn** to train and evaluate models using the ProPublica COMPAS Dataset.

Next, we will use **AI Fairness 360** to train and evaluate models using the German Credit Dataset. 

Finally, we will use **fairkit-learn** to train and evaluate models using the Adult Census Income Dataset. 

Along with the provided tooling and resources within this notebook, you will be allowed to use outside resources (e.g. Google) to help you complete this exercise.

Please plan to complete the entire exercise in one sitting. Make sure you have time and your computer is plugged into power before you start; you'll be running machine learning algorithms, which will wear your battery down.

Responses for this exercise will be entered in the <a href="https://form.jotform.com/92474488429169" target="_blank">Evaluating ML Models Exercise Response Form</a>. You will first be asked some demographic questions; each page that follows corresponds to one of the tasks you complete. You are expected to enter responses for each task, and you must submit the form for your assignment to be graded.


## Models

Because there are a variety of models provided by scikit-learn and AI Fairness 360, we will only use a subset for this assignment. The models you will be evaluating are as follows:

* **Logistic Regression**: a predictive algorithm for classification problems that models the probability of each class. [More info here.](https://machinelearningmastery.com/logistic-regression-for-machine-learning/) [Scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html)
* **K Nearest Neighbor Classifier**: a model that classifies data points based on the points that are most similar to them. It uses labeled training data to make an “educated guess” about how an unclassified point should be classified. [More info here.](https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761) [Scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html)
* **Random Forest**: an ensemble machine learning algorithm that is used for classification and regression problems. Random forest applies the technique of bagging (bootstrap aggregating) to decision tree learners. [More info here.](https://towardsdatascience.com/understanding-random-forest-58381e0602d2) [Scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
* **Support Vector Classifier**: a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane that categorizes new examples. In two-dimensional space, this hyperplane is a line dividing the plane into two parts, with each class lying on one side. [More info here.](https://medium.com/machine-learning-101/chapter-2-svm-support-vector-machine-theory-f0812effc72) [Scikit-learn documentation](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)
* **Adversarial Debiasing**: learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions. [Documentation.](https://aif360.readthedocs.io/en/latest/modules/inprocessing.html#adversarial-debiasing)

The Adversarial Debiasing model is only available for use when using AI Fairness 360 or fairkit-learn.


## Bias Mitigating Algorithms

When using AI Fairness 360 and fairkit-learn, you will have access to the following bias mitigating pre- and post- processing algorithms:

* **Pre-processing algorithms**
    - *Disparate Impact Remover*: a pre-processing technique that edits feature values to increase group fairness while preserving rank-ordering within groups
    - *Reweighing*: a pre-processing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification
    
    
* **Post-processing algorithms**
    - *Calibrated Equalized Odds*: a post-processing technique that optimizes over calibrated classifier score outputs to find probabilities with which to change output labels with an equalized odds objective
    - *Reject Option Classification*: a post-processing technique that gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups in a confidence band around the decision boundary with the highest uncertainty
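
To make the idea behind Reweighing more concrete, here is a minimal sketch in plain Python. It assumes the commonly stated weight formula w(group, label) = P(group) · P(label) / P(group, label), estimated from counts. The function name `reweighing_weights` and the toy data are hypothetical, and this is an illustration rather than the AI Fairness 360 implementation.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Return one weight per example: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Toy data (hypothetical): group 1 is privileged, label 1 is favorable.
groups = [1, 1, 1, 1, 0, 0, 0, 0]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)
for g, y, w in zip(groups, labels, weights):
    print(f"group={g} label={y} weight={w:.3f}")
```

In this toy example, the rarer (group, label) combinations receive weights above 1 and the more common ones receive weights below 1, so the weighted data no longer shows an association between group and label.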


## Model Evaluation Metrics

To evaluate your trained models, you will be using one or more of the following metrics:

* **Performance metrics**:
    - *Accuracy Score* (UnifiedMetricLibrary.accuracy_score). When evaluating a model with this metric, the goal is to *maximize* the value.
    
    
* **Fairness Metrics**:
    - *Equal Opportunity Difference* (UnifiedMetricLibrary.equal_opportunity_difference), also known as "true positive rate difference". When evaluating a model with this metric, the goal is to *minimize* the value.
    - *Average Odds Difference* (UnifiedMetricLibrary.average_odds_difference). When evaluating a model with this metric, the goal is to *minimize* the value.
    - *Statistical Parity Difference* (UnifiedMetricLibrary.mean_difference), also known as "mean difference". When evaluating a model with this metric, the goal is to *minimize* the value.
    - *Disparate Impact* (UnifiedMetricLibrary.disparate_impact). When evaluating a model with this metric, the goal is to *maximize* the value.
    
    
* **Overall Model Quality**:
    - *Classifier Quality Score* (classifier_quality_score). When evaluating a model with this metric, the goal is to *maximize* the value.
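
As a rough illustration of what the fairness metrics above measure, the sketch below computes them by hand in plain Python. It assumes the group-difference definitions commonly used for these metrics (unprivileged rate minus privileged rate, or their ratio for disparate impact). The helper names `group_rates` and `fairness_report` are hypothetical, and the results are not guaranteed to match `UnifiedMetricLibrary` exactly.

```python
def _rate(flags):
    """Fraction of True values; 0.0 for an empty input to avoid dividing by zero."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0

def group_rates(y_true, y_pred, groups, group, favorable=1):
    """Selection rate, true positive rate, and false positive rate for one group."""
    rows = [(t, p) for t, p, g in zip(y_true, y_pred, groups) if g == group]
    selection = _rate(p == favorable for _, p in rows)
    tpr = _rate(p == favorable for t, p in rows if t == favorable)
    fpr = _rate(p == favorable for t, p in rows if t != favorable)
    return selection, tpr, fpr

def fairness_report(y_true, y_pred, groups, privileged=1, unprivileged=0, favorable=1):
    """Group fairness metrics, each computed as unprivileged versus privileged."""
    sel_u, tpr_u, fpr_u = group_rates(y_true, y_pred, groups, unprivileged, favorable)
    sel_p, tpr_p, fpr_p = group_rates(y_true, y_pred, groups, privileged, favorable)
    return {
        "statistical_parity_difference": sel_u - sel_p,
        "disparate_impact": sel_u / sel_p if sel_p else float("nan"),
        "equal_opportunity_difference": tpr_u - tpr_p,
        "average_odds_difference": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }

# Toy data (hypothetical): group 1 is privileged, label 1 is favorable.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
report = fairness_report(y_true, y_pred, groups)
for name, value in report.items():
    print(name, round(value, 3))
```

Under these definitions, a difference of 0 and a ratio of 1 would indicate that the two groups are treated equally on that measure. Which label counts as favorable depends on the dataset, so the `favorable` argument may need to change.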

## Getting started

Before beginning task 1, make sure to run the following cell to import all necessary packages. If you need any additional packages, add the import statement(s) to the cell below and re-run the cell before adding and running code that uses the additional packages. 

**For this task you are only allowed to use functionality provided by scikit-learn to train and evaluate your models. If you have your own custom code you would like to add to evaluate your models, you may do so (without using functionality provided by the tools used in the previous tasks).**


In [None]:
# Load all necessary packages
import numpy as np
import sklearn as skl
import six

# dataset
from aif360.datasets import CompasDataset

# models
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier 
from sklearn.svm import SVC 

# metric
from sklearn.metrics import accuracy_score

# Tutorial 1: scikit-learn

First, we show you how to train and evaluate models using scikit-learn. You will use the knowledge from this tutorial to complete Task 1, so please read thoroughly and execute the code cells in order.

## Step 1: Import the dataset

First we need to import the dataset we will use for training and testing our model.

Below, we provide code that imports the COMPAS recidivism dataset. **Note: a warning may pop up when you run this cell. As long as you don't see any errors in the code, it is fine to continue.**


In [None]:
data_orig = CompasDataset()

## Step 2: Split the dataset into train and test data

Now that the dataset has been imported, we need to split the original dataset into training and test data. 

The code to do so is as follows:

In [None]:
data_orig_train, data_orig_test = data_orig.split([0.7], shuffle=False)
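
To illustrate what an unshuffled 70/30 split means, here is a small sketch on a plain Python list. It assumes that `split([0.7], shuffle=False)` keeps the rows in their original order and places the first 70% in the training set; the helper `ordered_split` is hypothetical and only mimics that assumed behavior.

```python
def ordered_split(rows, train_fraction=0.7):
    """Split rows in their existing order: first part for training, rest for testing."""
    cut = int(train_fraction * len(rows))
    return rows[:cut], rows[cut:]

rows = list(range(10))  # stand-in for 10 dataset rows
train_rows, test_rows = ordered_split(rows)
print(train_rows)
print(test_rows)
```

Because the order is preserved, any pattern in how the rows are ordered in the dataset carries over into the two subsets.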

## Step 3: Initialize model 

Next, we need to initialize our model. We can either initialize a model with no arguments, which uses the default hyper-parameter values (see the documentation), or pass arguments to modify specific values.

For the tutorial, we use the Logistic Regression model with default hyper-parameter values; you will be able to use any of the scikit-learn models listed above, and modify hyper-parameter values, when completing the exercise. 

Below we provide code for initializing the Logistic Regression model with default hyper-parameter values. We also provide (commented) code that reminds you of how to initialize each model available during this exercise.


In [None]:
# model is populated with default values; modifying parameters is allowed but optional
# (multi_class is left at its library default; the old 'warn' placeholder is rejected by newer scikit-learn releases)
model = LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0,
                           fit_intercept=True, intercept_scaling=1, class_weight=None,
                           random_state=None, solver='liblinear', max_iter=100,
                           verbose=0, warm_start=False, n_jobs=None)

#model = KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto',
#                             leaf_size=30, p=2, metric='minkowski', metric_params=None,
#                             n_jobs=None)

# n_estimators=100 replaces the old 'warn' placeholder; min_impurity_split is omitted
# because newer scikit-learn releases no longer accept it
#model = RandomForestClassifier(n_estimators=100, criterion='gini', max_depth=None,
#                               min_samples_leaf=1, min_weight_fraction_leaf=0.0,
#                               bootstrap=True, oob_score=False, n_jobs=None,
#                               random_state=None, verbose=0, warm_start=False, class_weight=None)

# gamma='scale' replaces the old 'auto_deprecated' placeholder
#model = SVC(C=1.0, kernel='rbf', degree=3, gamma='scale', coef0=0.0, shrinking=True,
#            probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False,
#            max_iter=-1, decision_function_shape='ovr', random_state=None)
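
If you want to modify hyper-parameter values, one possible approach is sketched below using the `get_params` and `set_params` methods that scikit-learn estimators provide. The specific values (`C=0.1`, `class_weight="balanced"`) are arbitrary illustrations, not recommendations.

```python
from sklearn.linear_model import LogisticRegression

# Start from defaults (only the solver is set explicitly here).
model = LogisticRegression(solver="liblinear")
print(model.get_params()["C"])

# set_params changes values on an existing estimator; call it before fitting.
model.set_params(C=0.1, class_weight="balanced")
print(model.get_params()["C"], model.get_params()["class_weight"])
```

Passing the same keyword arguments directly to the constructor has the same effect; `set_params` is mainly convenient when you loop over several configurations.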

## Step 4: Train the model

After initializing the model, we train it using the training dataset.

Below we provide code that prepares our dataset to be used with scikit-learn and trains the model using our prepared data.

In [None]:
# prepare data for use with scikit-learn
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

x_train = scaler.fit_transform(data_orig_train.features)
y_train = data_orig_train.labels.ravel()


model.fit(x_train, y_train)

## Step 5: Evaluate the model

Now we're ready to evaluate the trained model on the test data using the accuracy metric provided by scikit-learn.

Below we provide code snippets that show how to evaluate a model's performance using scikit-learn.

In [1]:
# reuse the scaler fitted on the training data; do not refit it on the test data
x_test = scaler.transform(data_orig_test.features)

predictions = model.predict(x_test)
accuracy = accuracy_score(data_orig_test.labels.ravel(), predictions)

print ('Accuracy = ' + str(accuracy))
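
One way to keep scaling and prediction consistent is to bundle them in a scikit-learn pipeline. The sketch below is a minimal example under stated assumptions: it uses synthetic data from `make_classification` as a stand-in for the notebook's arrays, and the 140/60 split is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical); replace with the notebook's train/test arrays.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test = X[:140], X[140:]
y_train, y_test = y[:140], y[140:]

# The pipeline fits the scaler on the training data only, then reuses
# those training statistics when predicting on the test data.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(solver="liblinear"))
pipeline.fit(X_train, y_train)

accuracy = accuracy_score(y_test, pipeline.predict(X_test))
print("Accuracy = " + str(accuracy))
```

With this design, the test data never influences the scaling statistics, so the accuracy estimate reflects how the model would behave on unseen data.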


# Task 1: Model evaluation with scikit-learn

Your turn! Use what you learned in the above tutorial to train and evaluate models for performance, fairness, and overall quality. You will use functionality provided by scikit-learn to meet the following goals:

1. **Describe a model you believe will perform the best (e.g., have the highest accuracy score).** 

2. **Describe a model you believe will be the most fair, regardless of performance.** 

3. **Describe a model you believe will best balance both performance and fairness.** 

Make sure you include any modifications to model hyper-parameters. **As a reminder, there is no "absolute best" model for each of the above goals. You are expected to explore the space of model configurations available to find a model that best meets the above goals.**

**Keep in mind, training machine learning models is often a time-intensive endeavor.** One way to reduce the time needed to finish the assignment is to avoid retraining a model every time you want to evaluate it. You can do this by putting the code that initializes and trains your model(s) in its own cell and executing that cell only when needed.
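
The following sketch shows one possible way to separate training from evaluation. It uses synthetic data as a stand-in for the prepared COMPAS arrays, and the candidate names and hyper-parameter values are hypothetical. In the notebook, the two commented sections would live in separate cells.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (hypothetical); use the notebook's prepared arrays instead.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
x_train, x_test, y_train, y_test = X[:140], X[140:], y[:140], y[140:]

# "Training cell": fit each candidate once and keep the fitted models.
candidates = {
    "logreg_C=1.0": LogisticRegression(C=1.0, solver="liblinear"),
    "logreg_C=0.1": LogisticRegression(C=0.1, solver="liblinear"),
    "knn_k=5": KNeighborsClassifier(n_neighbors=5),
}
trained = {name: model.fit(x_train, y_train) for name, model in candidates.items()}

# "Evaluation cell": rerun as often as needed without retraining.
scores = {name: accuracy_score(y_test, m.predict(x_test)) for name, m in trained.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(name, round(score, 3))
```

Keeping the fitted models in a dictionary makes it easy to add further evaluation code later, such as custom fairness checks, without paying the training cost again.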

## Submitting your response 

Once you feel you've met the above goals, go to the Evaluating ML Models Exercise Response Form to enter your responses under the section labeled 'Task 1'. 

If you haven't opened/started a response form yet, click <a href="https://form.jotform.com/92474488429169" target="_blank">here</a> to get started.

If you accidentally closed your response form, check your email for the link to re-open it.

In [None]:
# TODO : Use this cell to write code for completing task 1





When you're ready to go on to the next task, open a new tab and click <a href="http://localhost:8888/notebooks/Task_2.ipynb" target="_blank">here</a>.