<h1 align='center'>Ethics in Artificial Intelligence</h1>

<h3 align='center'>Laura G. Funderburk</h3>

<h3 align='center'>Data Scientist, Cybera</h3>



<h2 align='center'>What is Artificial Intelligence (AI)</h2>

<h2 align='center'>What is Ethics</h2>

<h2 align='center'>Why Ethics in AI matter</h2>


 AI systems can behave unfairly for a variety of reasons: 

1. Societal biases are reflected in the training data. 

2. Societal biases are also reflected in the decisions made during the development and deployment of these systems. 

3. AI systems behave unfairly because of characteristics of the data or characteristics of the systems themselves. 

$$\Rightarrow \text{They are not mutually exclusive and often exacerbate one another} \Leftarrow$$ 



<h2 align='center'>How can we determine whether an AI is behaving unfairly?</h2>


1. By identifying underlying misconceptions within the AI system (cause: societal context or bias, or intent, such as prejudice)
2. By studying the impact of the AI system on people (outcome: harms versus gains)

**For this workshop, we define whether an AI system is behaving unfairly in terms of its impact on people.**

<h2 align='center'>A note on the word bias</h2>

Since we define fairness in terms of **harms** rather than specific **causes** (such as societal-based context), we avoid the usage of the words *bias* or *debiasing*.


<h2 align='center'>Types of harms</h2>

From keynote by K. Crawford at NeurIPS 2017 <https://www.youtube.com/watch?v=fMym_BKWQzk>

* *Allocation harms* can occur when AI systems extend or withhold
  opportunities, resources, or information. 
  
  **Sample key applications: hiring, school admissions, and lending.**
  

<h2 align='center'>Types of harms</h2>

From keynote by K. Crawford at NeurIPS 2017 <https://www.youtube.com/watch?v=fMym_BKWQzk>


* *Quality-of-service harms* can occur when a system does not work as well for
  one person as it does for another, even if no opportunities, resources, or
  information are extended or withheld. 
  
  **Sample key applications: accuracy in face recognition, document search, or product recommendation.**
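
Both kinds of harm can be made measurable per group. As a minimal sketch on made-up records (the groups `A` and `B`, the labels, and the helper `rates_by_group` are all hypothetical), an allocation harm would show up as a gap in selection rate, and a quality-of-service harm as a gap in accuracy:

```python
from collections import defaultdict

# Hypothetical toy records: (group, true_label, predicted_label).
# A prediction of 1 means "offer the opportunity" (for example, approve a loan).
records = [
    ("A", 1, 1), ("A", 0, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 1),
]


def rates_by_group(rows):
    """Return {group: (selection_rate, accuracy)} for the given rows."""
    counts = defaultdict(lambda: {"n": 0, "selected": 0, "correct": 0})
    for group, y_true, y_pred in rows:
        c = counts[group]
        c["n"] += 1
        c["selected"] += y_pred
        c["correct"] += int(y_true == y_pred)
    return {g: (c["selected"] / c["n"], c["correct"] / c["n"])
            for g, c in counts.items()}


result = rates_by_group(records)
for group, (selection_rate, accuracy) in sorted(result.items()):
    print(f"group {group}: selection rate {selection_rate:.2f}, accuracy {accuracy:.2f}")
```

In this toy data, group `A` is both selected more often and predicted more accurately than group `B`, so both kinds of harm would be present at once.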

<h2 align='center'>How we work with information: abstraction</h2>


**Abstracting in computer science:**  

This is the process of removing physical, spatial, or temporal details or attributes in the study of objects or systems with the goal of focusing attention on details deemed more important. 



<h2 align='center'>How we work with information: abstraction</h2>


**Abstracting in mathematics:** 

This is the process of 

1. Extracting the underlying structures, patterns or properties of a mathematical concept;

2. Removing any dependence on real world objects with which it might originally have been connected;

3. Generalizing it so that it has wider applications.


<h2 align='center'>How we work with information: abstraction</h2>


**Abstracting in machine learning:** 

Machine Learning encompasses all approaches (design and development of algorithms) that allow a computer to “learn”, based on a database of examples or sensor data (abstractions of real situations).

In Machine Learning, abstraction manifests in algorithms that learn from examples (supervised/unsupervised learning) and algorithms that learn from reinforcement (reinforcement learning).

<h2 align='center'>How we work with information: abstraction</h2>


**Abstracting in machine learning:** 

Machine Learning algorithms can be classified into three broad classes [1]:

1. In supervised learning, an algorithm learns a function that maps inputs to a given set of labels (classes).

2. In unsupervised learning, an algorithm learns how to unravel hidden structures in unlabeled data (also called observations).

3. In reinforcement learning, an algorithm learns how an agent ought to take actions in an environment so as to maximize some reward.
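
To make the first two classes concrete, here is a small pure-Python sketch on hypothetical one-dimensional data. The supervised learner is a nearest-centroid classifier and the unsupervised learner is a two-cluster k-means; both were chosen only for brevity, and reinforcement learning is left out because it needs an environment to interact with.

```python
# Hypothetical 1-D toy data: each number is one observation.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = [0, 0, 0, 1, 1, 1]  # only the supervised learner sees these


def fit_centroids(xs, ys):
    """Supervised: learn one centroid per label from labelled examples."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(xs, iterations=10):
    """Unsupervised: split unlabelled data into two clusters (1-D k-means, k=2)."""
    low, high = min(xs), max(xs)  # initial cluster centres
    for _ in range(iterations):
        near_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        near_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


centroids = fit_centroids(points, labels)   # uses the labels
low, high = two_means(points)               # never sees the labels
print(predict(centroids, 1.1), predict(centroids, 4.0))
print(low, high)
```

Both learners end up with essentially the same two centres here, but only the supervised one knows what the centres *mean*, because that meaning came from the labels a person chose.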




<h2 align='center'>How do we fall into misrepresentations in machine learning?</h2>


$\Rightarrow$ When we abstract and remove attributes or properties that have dependence on a social context, how do we determine which properties are *worth* preserving and describing?

$\Rightarrow$ What is the trade-off we make when discarding a property, and are we aware of the consequences of that trade-off?

$\Rightarrow$ Can we reliably identify and quantify the consequences of removing a socially dependent attribute? Who is on the receiving end of these consequences?

<h2 align='center'>How we frame a problem is key to identifying problems in abstraction</h2>
<h3 align='center'>The algorithmic frame</h3>

Frame centered around choices made when abstracting a problem in the form of representations (input data) and labelling (outcome). 

Evaluated on accuracy and generalizability to data the model did not train on. 

**Fairness cannot be defined in this frame $\Rightarrow$ goal is to produce a model that best captures the relationship between representations and labels.** 

<h2 align='center'>How we frame a problem is key to identifying problems in abstraction</h2>
<h3 align='center'>The data frame</h3>

This frame is concerned with the quality of representations (input data) and the outcomes that result. 

**This additional frame allows us to question the inherent (un)fairness present in input and output data.**

For example, under this frame we can ask whether the training dataset for a recommendation algorithm includes demographic and socio-economic information, and whether the quality of its recommendations is assessed with respect to that information.

<h2 align='center'>How we frame a problem is key to identifying problems in abstraction</h2>
<h3 align='center'>The sociotechnical frame</h3>

$$\text{Technical Component} + \text{Social Component} = \text{Sociotechnical system}$$


This frame recognizes that a machine learning model is part of the interaction between people and technology, and thus any social components of this interaction need to be modelled and incorporated accordingly.

**Designers of machine learning systems who fail to consider the way in which social contexts and technologies
are interrelated are at risk of falling into "abstraction traps".**

Example: assessing the risk that an individual charged with an offense will
re-engage in criminal behaviour, and recommending appropriate measures to prevent relapse,
while failing to consider factors such as race, socio-economic status, and mental health, along with
the socially dependent views held by judges, police officers, or any other actors responsible
for recommending a course of action.



In the algorithmic framework, for example, input variables may include previous criminal history and
statements taken from the accused, witnesses, and police officers. Labels (outcomes)
include the algorithm's recommendations on an appropriate course of action based
on a computed risk score. The model is limited in assessing the fairness of the outcome.



The data framework could attempt to reduce unfairness by studying socio-economic
information regarding the accused, their upbringing and how it relates to their
current status, along with a recommendation that incorporates these factors into their
recovery.



Within the sociotechnical framework the model incorporates not only more nuanced
data on the history of the case, but also the social context in which judging and
charging people with offenses take place. This model incorporates the processes
associated with crime reporting, the offense-trial pipeline, and identifies areas
in which different people interact with one another as outcomes are recommended.

<h2 align='center'>Common "traps" we can fall into</h2>

Selbst *et al.* [2] identify five traps we can fall into when implementing a machine learning model:

* The Solutionism Trap

* The Ripple Effect Trap

* The Formalism Trap

* The Portability Trap

* The Framing Trap

<h2 align='center'>Common "traps" we can fall into</h2>
<h3 align='center'>The Solutionism Trap</h3>


This trap occurs when we assume that the best solution to a problem
involves technology, and fail to recognize other possible solutions outside of
this realm. One area where this manifests is in contexts in which the
definition of "fairness" changes or depends on a political context.


<h2 align='center'>Common "traps" we can fall into</h2>
<h3 align='center'>The Ripple Effect Trap</h3>



<h2 align='center'>Common "traps" we can fall into</h2>
<h3 align='center'>The Formalism Trap</h3>

This trap occurs when implementing an algorithmic solution that fails to take
into account the social dimensions associated with fairness in a situation. These
dimensions include "procedurality", "contextuality" and "contestability". Such
dimensions often cannot be resolved via a purely mathematical framework.


<h2 align='center'>Common "traps" we can fall into</h2>
<h3 align='center'>The Portability Trap</h3>


This trap occurs when we fail to understand that a model or
algorithm designed for one specific social context may not necessarily
apply to a different social context. Reusing an algorithmic solution without
taking into account the differences between the social contexts involved can produce misleading
results, and potentially harmful consequences if the algorithm is used to determine the
fate of an individual.

<h2 align='center'>Common "traps" we can fall into</h2>
<h3 align='center'>The Framing Trap</h3>

This trap occurs when we fail to consider the full picture surrounding
a particular social context when abstracting a social problem, and implementing an
algorithm in which the outcome involves enforcing decisions that will impact a person
or group of people. It can occur at the data collection, data selection for learning, data processing, and labelling stages.

<h2 align='center'>What does this look like in practice?</h2>



Consider reusing a machine learning algorithm built to screen job applications in the
nursing industry to screen job applications in the information technology sector, an instance of the Portability Trap. An intuitive
yet important difference between the two contexts is the set of skills required to
succeed in each industry. A more subtle difference is the gender makeup of the people
who typically work in each industry, which results from the wording of job postings,
social constructs around gender and societal roles, and the male-female ratio of successful
applicants in each field.

<h2 align='center'>Introducing Fairlearn:<br> a Python library focused on decreasing unfairness in machine learning models </h2>

The dataset used here is the Adult census dataset, fetched from OpenML. It poses a classification problem: given a range of data about roughly 48,000 individuals, predict whether their annual income is above or below fifty thousand dollars per year.

For the purposes of this notebook, we shall treat this as a loan decision problem. The label indicates whether or not each individual repaid a loan in the past. We will use the data to train a predictor to predict whether previously unseen individuals will repay a loan or not. 

**The assumption is that the model predictions are used to decide whether an individual should be offered a loan.**

In [None]:
from sklearn.model_selection import train_test_split
from fairlearn.reductions import GridSearch
from fairlearn.reductions import DemographicParity, ErrorRate
from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn import metrics as skm
import pandas as pd

Exploring data

In [None]:
from sklearn.datasets import fetch_openml
data = fetch_openml(data_id=1590, as_frame=True)
X_raw = data.data
Y = (data.target == '>50K') * 1
X_raw

We are going to treat the sex of each individual as a sensitive feature (in this copy of the dataset the `sex` column takes the values `Female` and `Male`), and in this particular case we are going to separate this feature out and drop it from the main data. We then perform some standard data preprocessing steps to convert the data into a format suitable for the ML algorithms.

In [None]:
# Isolate 'sex' column, create dataframe without 'sex'
A = X_raw["sex"]
X = X_raw.drop(labels=['sex'], axis=1)
X = pd.get_dummies(X)

X.head()

In [None]:
# Scale data 
sc = StandardScaler()
# Apply scaling to dataframe
X_scaled = sc.fit_transform(X)
X_scaled = pd.DataFrame(X_scaled, columns=X.columns)

# Set up label encoder
le = LabelEncoder()
Y = le.fit_transform(Y)
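
For intuition about the scaling step above, here is a minimal pure-Python sketch of standardization on a hypothetical column of ages. It assumes the population standard deviation, which is what `StandardScaler` uses by default; the helper `standardize` is illustrative only and is not part of scikit-learn.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance (population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, 30, 40, 50, 60]  # hypothetical column of ages
scaled = standardize(ages)
print([round(z, 3) for z in scaled])
```

After scaling, every column is expressed in units of its own standard deviation, so no feature dominates the logistic regression simply because it was recorded on a larger numeric scale.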

Finally, we split the data into training and test sets:

In [None]:
X_train, X_test, Y_train, Y_test, A_train, A_test = train_test_split(X_scaled,
                                                                     Y,
                                                                     A,
                                                                     test_size=0.2,
                                                                     random_state=0,
                                                                     stratify=Y)

# Work around indexing bug
X_train = X_train.reset_index(drop=True)
A_train = A_train.reset_index(drop=True)
X_test = X_test.reset_index(drop=True)
A_test = A_test.reset_index(drop=True)

<h2 align='center'>Training a fairness-unaware predictor</h2>

In [None]:
unmitigated_predictor = LogisticRegression(solver='liblinear', fit_intercept=True)

unmitigated_predictor.fit(X_train, Y_train)

We can start to assess the predictor's fairness using the `MetricFrame`:

In [None]:
# Recent Fairlearn releases name this argument `metrics` (older ones used `metric`)
metric_frame = MetricFrame(metrics={"accuracy": skm.accuracy_score,
                                    "selection_rate": selection_rate},
                           sensitive_features=A_test,
                           y_true=Y_test,
                           y_pred=unmitigated_predictor.predict(X_test))
print(metric_frame.overall)
print(metric_frame.by_group)
metric_frame.by_group.plot.bar(
        subplots=True, layout=[3, 1], legend=False, figsize=[12, 8],
        title='Accuracy and selection rate by group');

Looking at the disparity in accuracy, we see that the error for males is about three times greater than the error for females. More interesting is the disparity in opportunity: males are offered loans at three times the rate of females.

Despite the fact that we removed the feature from the training data, our predictor still discriminates based on sex. This demonstrates that simply ignoring a sensitive feature when fitting a predictor rarely eliminates unfairness. There will generally be enough other features correlated with the removed feature to lead to disparate impact.
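
The effect of correlated features can be reproduced on purely synthetic data. In the sketch below, everything is hypothetical: the decision rule never sees `group`, only a `proxy` feature that is constructed to agree with group membership 90% of the time.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical synthetic population: `proxy` is 1 for most members of group "M"
# and 0 for most members of group "F" (it matches group membership 90% of the time).
people = []
for _ in range(10_000):
    group = random.choice(["F", "M"])
    matches = random.random() < 0.9
    proxy = (group == "M") if matches else (group != "M")
    people.append({"group": group, "proxy": int(proxy)})


def decide(person):
    """A 'group-blind' rule: approve whenever the proxy feature is 1."""
    return person["proxy"]


def selection_rate_for(rows, group):
    """Share of people in `group` who are approved by `decide`."""
    members = [p for p in rows if p["group"] == group]
    return sum(decide(p) for p in members) / len(members)


rate_f = selection_rate_for(people, "F")
rate_m = selection_rate_for(people, "M")
print(f"selection rate F: {rate_f:.2f}, M: {rate_m:.2f}")
```

Even though `decide` never reads the `group` field, the two selection rates end up far apart, because the proxy carries most of the information that was supposedly removed.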

<h2 align='center'>Mitigation with GridSearch</h2>

The `fairlearn.reductions.GridSearch` class implements a simplified version of the exponentiated gradient reduction of Agarwal et al. (2018). The user supplies a standard ML estimator, which is treated as a black box. `GridSearch` generates a sequence of relabellings and reweightings, and trains a predictor for each.

For this example, we specify demographic parity (on the sensitive feature of sex) as the fairness metric. Demographic parity requires that individuals are offered the opportunity (are approved for a loan in this example) independent of membership in the sensitive class (i.e., females and males should be offered loans at the same rate). We are using this metric for the sake of simplicity; in general, the appropriate fairness metric will not be obvious.
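
As a rough illustration of what demographic parity measures, the sketch below computes the gap in selection rate between groups on made-up loan decisions. The helper functions are hand-rolled for this example and take different arguments from the function of the same name in `fairlearn.metrics`.

```python
def selection_rates(y_pred, groups):
    """Fraction of positive predictions within each group."""
    totals, positives = {}, {}
    for pred, group in zip(y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(y_pred, groups).values()
    return max(rates) - min(rates)


# Hypothetical loan decisions: 1 = offered a loan, 0 = not offered.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]
gap = demographic_parity_difference(y_pred, groups)
print(gap)
```

Note that the true labels do not appear anywhere in this calculation: demographic parity looks only at who is selected, not at whether the selection was correct, which is one reason it is not always the appropriate metric.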

In [None]:
# Setting up GridSearch
sweep = GridSearch(LogisticRegression(solver='liblinear', fit_intercept=True),
                   constraints=DemographicParity(),
                   grid_size=71)

In [None]:
# Fit training data using GridSearch
sweep.fit(X_train, Y_train,
          sensitive_features=A_train)

predictors = sweep.predictors_

In [None]:
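# Evaluate every predictor from the sweep on the training data
errors, disparities = [], []
for m in predictors:
    # Wrap the model so the moment objects can call it as a plain function
    def classifier(X): return m.predict(X)

    # Overall error rate of this predictor
    error = ErrorRate()
    error.load_data(X_train, pd.Series(Y_train), sensitive_features=A_train)
    # Demographic parity violation of this predictor
    disparity = DemographicParity()
    disparity.load_data(X_train, pd.Series(Y_train), sensitive_features=A_train)

    # .iloc[0] selects the first entry by position, whatever its index label
    errors.append(error.gamma(classifier).iloc[0])
    disparities.append(disparity.gamma(classifier).max())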

all_results = pd.DataFrame({"predictor": predictors, "error": errors, "disparity": disparities})

non_dominated = []
for row in all_results.itertuples():
    errors_for_lower_or_eq_disparity = all_results["error"][all_results["disparity"] <= row.disparity]
    if row.error <= errors_for_lower_or_eq_disparity.min():
        non_dominated.append(row.predictor)
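
The selection loop above keeps only the predictors that are not dominated: a predictor is dropped when some other predictor has lower error and no greater disparity. The same rule can be traced by hand on a few made-up (error, disparity) pairs; the model names and numbers below are hypothetical.

```python
# Hypothetical (error, disparity) pairs for four candidate models.
candidates = {
    "m0": (0.15, 0.20),
    "m1": (0.16, 0.10),
    "m2": (0.18, 0.12),  # dominated by m1: higher error and higher disparity
    "m3": (0.20, 0.02),
}


def non_dominated_models(models):
    """Keep models whose error is minimal among all models with disparity <= their own."""
    kept = []
    for name, (error, disparity) in models.items():
        rival_errors = [e for e, d in models.values() if d <= disparity]
        if error <= min(rival_errors):
            kept.append(name)
    return kept


front = non_dominated_models(candidates)
print(front)
```

The surviving models trade accuracy against disparity; none of them is better than another on both axes, so choosing among them is a judgment call rather than a purely technical decision.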

In [None]:
predictions = {"unmitigated": unmitigated_predictor.predict(X_test)}
metric_frames = {"unmitigated": metric_frame}
for i in range(len(non_dominated)):
    key = "dominant_model_{0}".format(i)
    predictions[key] = non_dominated[i].predict(X_test)

    metric_frames[key] = MetricFrame(metrics={"accuracy": skm.accuracy_score,
                                              "selection_rate": selection_rate},
                                     sensitive_features=A_test,
                                     y_true=Y_test,
                                     y_pred=predictions[key])

import matplotlib.pyplot as plt
x = [metric_frame.overall['accuracy'] for metric_frame in metric_frames.values()]
y = [metric_frame.difference()['selection_rate'] for metric_frame in metric_frames.values()]
keys = list(metric_frames.keys())
plt.scatter(x, y)
for i in range(len(x)):
    plt.annotate(keys[i], (x[i] + 0.0003, y[i]))
plt.xlabel("accuracy")
plt.ylabel("selection rate difference")

<h2 align='center'>References</h2>

[1] Saitta, Lorenza and Zucker, Jean-Daniel, "Abstraction in Artificial Intelligence and Complex Systems" (2013), Chapter "Abstraction in Machine Learning", pp. 273-327, Springer New York, <https://doi.org/10.1007/978-1-4614-7052-6_9>

[2] Selbst, Andrew D. and Boyd, Danah and Friedler, Sorelle and Venkatasubramanian,
      Suresh and Vertesi, Janet, "Fairness and Abstraction in Sociotechnical Systems" (August 23, 2018).
      2019 ACM Conference on Fairness, Accountability, and Transparency (FAT*), 59-68, Available at
      SSRN: <https://ssrn.com/abstract=3265913>