# SLU17 - Ethics & Fairness - Learning notebook

When you work as a data scientist, the predictions of your model have consequences in the real world, on real people. The model's predictions are influenced by the data used for training, by the variables we choose to include, and by the metrics we optimize against. This notebook explains how these choices can build bias and unfairness into the predictions, and how to avoid that.

### Table of Contents

[1. Components of a learning system](#1.-Components-of-a-learning-system)   
[2. Privacy by default](#2.-Privacy-by-default)   
[3. Bias and fairness](#3.-Bias-and-fairness)   
[4. Interesting links](#4.-Interesting-links)

## 1. Components of a learning system

We can think of learning systems as circular loops, i.e., learning loops, that both influence society and are influenced by it.

![components-of-a-learning-system](./media/components-of-a-learning-system.png)

*Fig. 1: Components of a learning system*

We examine its constituent parts.

**State of the world**   
In short, "State of the world" stands for **reality**, including all the disparities, biases, and general messiness out there.

**Measurement**   
Measurement is the process of **reducing** the World's messiness to a neat set of values or measurements.

**Data**   
A **representation** of the object of interest to provide the model with a useful vantage point.

**Annotation**    
The act of manually **labeling** the data to generate new features or for supervised learning.

**Learning**   
**Training** a general(izable) model based on the available data.

**Model**   
The set of **rules** or operations, previously learned from the available data, to make or inform decisions.

**Product**   
The **experience** that encapsulates the model to create value, or benefit.

**Action**   
The **effect** or outcome delivered by the product based on the model results.

**Feedback**   
The **perception** of the action. It can be **explicit**, if given directly, or **implicit**, when inferred from the response.

**Context**   
Often overlooked, the context is the **setting** in which the user uses the product.

**Users**   
Self-explanatory, the end-user is someone who **uses** or is intended to use a product or service.

**Back to the beginning!**   
Since users, as people, are part of the World, **actions directly affect the state of the World**.

Moreover, there are often indirect effects through the user. Think [deepfakes](https://en.wikipedia.org/wiki/Deepfake), for example.

## 2. Privacy by default

The default behavior of a Data Scientist should be the most privacy-friendly one.

Privacy means that we should be most careful about personal data, respecting and protecting it, by design, at all times.

### 2.1 Personal data

Personal data is **any specific information relating to an identifiable person**.

Things that can be used to identify an individual include:
* Name
* Location
* Physical, physiological, mental information
* Genetic and biometric data
* Economic or cultural characteristics.

Sensitive information (e.g., ethnicity, gender, political opinions and religious beliefs) requires a higher level of scrutiny than general personal data.

### 2.2 Data collection

The central theme in data collection is **informed consent** and it applies to the purpose the data is being collected for. 

Additionally, it must be **limited to relevant data** for the task at hand, particularly when it concerns personal data.

Finally, the data should be accurate and updated (otherwise you should discard it, seriously).

If you are using data that was not collected by you, you should know where it comes from, how reliable it is, and for which purposes it was originally created.

A recent, infamous example is the Boston housing dataset, which shipped with several modelling packages such as sklearn. The dataset has racism and discrimination built in. Moreover, although it was originally created for an explanatory model, it became a standard benchmark dataset for testing the performance of predictive models, thereby propagating its built-in biases. See the links at the end of this notebook if you'd like to learn more.

### 2.3 Data storage

Once we collect data, we are responsible for protecting it and, ultimately, our users and/or subjects of interest.

We need a plan to **secure and protect the data** against unintended use. The plan should define who has access to the data: only the people who work with the data should have access to it.

It's also important that users are able to access, rectify, and erase their personal data (the latter is also known as the right to be forgotten).

### 2.4 Processing

The users should be able to **restrict the processing of their data**, if they so choose.

Personal information should not be used (much less displayed) unless absolutely necessary. Ideally, the data should be anonymized. Otherwise, the personal data should be stored in separate tables and connected with the rest of the data only by an identifier. Access to the personal data can then be restricted to a smaller group of people.
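The separation of personal data into its own table can be sketched as follows. This is a minimal illustration using plain Python structures; the records, the field names in `PERSONAL_FIELDS` and the helper `split_personal_data` are hypothetical and only serve the example.

```python
import uuid

# Hypothetical raw records mixing personal data with analytical features.
raw_records = [
    {"name": "Alice Example", "email": "alice@example.com", "age_band": "30-39", "purchases": 12},
    {"name": "Bob Example", "email": "bob@example.com", "age_band": "40-49", "purchases": 3},
]

# Assumption: these are the fields we treat as personal data in this example.
PERSONAL_FIELDS = ("name", "email")


def split_personal_data(records, personal_fields=PERSONAL_FIELDS):
    """Split records into a personal table and an analysis table linked by a random ID."""
    personal_table = {}
    analysis_table = []
    for record in records:
        # A random identifier carries no information about the person.
        subject_id = uuid.uuid4().hex
        personal_table[subject_id] = {key: record[key] for key in personal_fields}
        features = {key: value for key, value in record.items() if key not in personal_fields}
        analysis_table.append({"subject_id": subject_id, **features})
    return personal_table, analysis_table


personal_table, analysis_table = split_personal_data(raw_records)

# Only the analysis table is shared with the wider team.
print(analysis_table)
```

Note that this is pseudonymization rather than full anonymization: whoever holds `personal_table` can re-identify the subjects, and the remaining features may still allow re-identification when combined.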

We must ensure, at all times, honest representation of our subjects, in line with the underlying data.

All the processing should be auditable, properly documented and reproducible.

Finally, old and/or unnecessary data should be periodically deleted, according to a data retention plan.
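As a rough sketch of what a retention rule could look like in code, the snippet below drops records older than a fixed period. The one-year `RETENTION_PERIOD`, the `collected_at` field and the function `apply_retention` are assumptions made for illustration; the actual period depends on your retention plan.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy for this example: keep data for one year.
RETENTION_PERIOD = timedelta(days=365)


def apply_retention(records, now, retention_period=RETENTION_PERIOD):
    """Keep only the records collected within the retention period."""
    cutoff = now - retention_period
    return [record for record in records if record["collected_at"] >= cutoff]


# Hypothetical records, each with the timestamp of its collection.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"subject_id": "a1", "collected_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"subject_id": "b2", "collected_at": datetime(2021, 3, 15, tzinfo=timezone.utc)},
]

kept = apply_retention(records, now)
print([record["subject_id"] for record in kept])  # ['a1']
```

In practice, the deletion also has to reach backups and derived datasets, which a filter like this does not cover.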

### 2.5 Modeling

Once a model is in place, we must be vigilant and ready to react at all times.

We need to be able to evaluate the model with regard to its effect on users, performance deterioration and unintended use.

Given the results, we should be able to roll back a model if we need to.

## 3. Bias and fairness

When evaluating a model, a responsible Data Scientist should do more than calculate a loss metric.

Fairness implies fair predictions for different subgroups, e.g., different demographics. For that we need to:
* Audit the training data
* Evaluate predictions in a way that is fairness aware.

As you explore your data to figure out the best representation, it's important to proactively audit for potential sources of bias. 
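A first audit step could be to check how well each subgroup is represented in the training data and how the labels are distributed within each subgroup. The sketch below assumes a hypothetical training set with a `group` column and a binary `label` column; the helper names are illustrative.

```python
from collections import Counter


def group_shares(rows, group_key):
    """Return the share of rows that belongs to each value of `group_key`."""
    counts = Counter(row[group_key] for row in rows)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def positive_rate_by_group(rows, group_key, label_key):
    """Return the fraction of positive (1) labels within each group."""
    totals, positives = Counter(), Counter()
    for row in rows:
        totals[row[group_key]] += 1
        positives[row[group_key]] += row[label_key]
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical training data: group B is under-represented and has no positive labels.
train = (
    [{"group": "A", "label": 1}] * 4
    + [{"group": "A", "label": 0}] * 2
    + [{"group": "B", "label": 0}] * 2
)

shares = group_shares(train, "group")
rates = positive_rate_by_group(train, "group", "label")
print(shares)  # {'A': 0.75, 'B': 0.25}
print(rates)
```

Large gaps in either number are not proof of bias by themselves, but they point to where a closer look at the data collection is warranted.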

### 3.1 Biases in training data

As we’ve seen, the World isn’t without bias: representing it and measuring it with good accuracy would only incorporate the existing disparities. 

The subjectivity associated with measurement introduces new biases, on top of the unavoidable, worldly ones. 

#### Reporting bias

Reporting bias, for example, occurs when the frequency of events is poorly measured. 

This is often the case with online reviews: people are more likely to submit a review when they respond very strongly to the experience, i.e., when they love or hate it. 

As a result, it is common for the data to be artificially skewed towards the extremes and to under-represent the ordinary that “goes without saying”.

#### Selection bias

Selection bias arises when the examples in the data aren’t reflective of the real-world distribution. 

##### Coverage bias

You can fail to cover part of the population in your sample, known as coverage bias. 

When you survey a customer base for customer satisfaction but fail to include past customers, that’s coverage bias. 

##### Participation bias

Another type of selection bias is participation bias, which happens when a segment of the population is underrepresented due to a lack of responses. 

For example, customer satisfaction surveys may be sent to a representative sample, but if response rates are lower for past customers, we end up, once again, with an imbalanced sample. 

##### Sampling bias

The last case of selection bias is sampling bias, which arises from a lack of proper randomization. 

Imagine that you use only the first few responses when doing the customer satisfaction survey above, representing only the most engaged customers.
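The difference between taking the first few responses and drawing a random sample can be illustrated with a toy example. The data is made up: we assume 100 responses ordered by arrival time, with the 20 most engaged customers answering first.

```python
import random

# Hypothetical responses, ordered by arrival time: engaged customers answer first.
responses = [{"customer_id": i, "engaged": i < 20} for i in range(100)]


def engaged_share(sample):
    """Return the fraction of engaged customers in a sample."""
    return sum(response["engaged"] for response in sample) / len(sample)


# Biased: the first 10 responses contain only engaged customers.
first_n = responses[:10]

# Randomized: every response has the same chance of being selected.
rng = random.Random(0)  # fixed seed for reproducibility
random_n = rng.sample(responses, 10)

print("All responses:", engaged_share(responses))  # 0.2
print("First 10:     ", engaged_share(first_n))    # 1.0
print("Random 10:    ", engaged_share(random_n))
```

Randomization only addresses sampling bias. If some customers were never surveyed (coverage bias) or never answered (participation bias), a random draw from the collected responses does not bring them back.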

#### Automation bias

Automation bias favors results coming from automation, regardless of error rates and ignoring contradictory information. 

So, one might prefer a decision from an automated system, even if flawed, to more accurate expert knowledge.

#### Group attribution

In-group bias occurs when you give preference to members of a group you belong to. 

Out-group bias is the tendency to stereotype individual members of a different group. 

This bias has reinforced the glass ceiling for many demographics: hiring managers have tended to prefer people similar to themselves, typically white men.

#### Implicit bias

Implicit bias derives from assumptions made from one’s own experiences, which may or may not generalize.

#### Confirmation bias

Confirmation bias manifests itself as the propensity to affirm all or some of the preexisting beliefs and hypotheses of the experimenter.

### 3.2 Evaluating predictions

Metrics calculated against an entire test set don't give an accurate picture of how fair the model is.

Rather, metrics can and should be evaluated separately for different groups of interest, particularly when dealing with sensitive information.
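As a minimal sketch of such a per-group evaluation, the function below computes accuracy, true positive rate and false positive rate separately for each group of a binary classifier. The labels, predictions and group names are hypothetical, and `rates_by_group` is an illustrative helper rather than a library function.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Compute accuracy, true positive rate and false positive rate per group."""
    cells = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        cells[group][key] += 1

    report = {}
    for group, c in cells.items():
        n_positive = c["tp"] + c["fn"]
        n_negative = c["fp"] + c["tn"]
        report[group] = {
            "accuracy": (c["tp"] + c["tn"]) / (n_positive + n_negative),
            # A rate is undefined (None) if the group has no such examples.
            "tpr": c["tp"] / n_positive if n_positive else None,
            "fpr": c["fp"] / n_negative if n_negative else None,
        }
    return report


# Hypothetical test set with two groups of four examples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = rates_by_group(y_true, y_pred, groups)

print(overall_accuracy)  # 0.625
print(report["A"])       # {'accuracy': 1.0, 'tpr': 1.0, 'fpr': 0.0}
print(report["B"])       # {'accuracy': 0.25, 'tpr': 0.0, 'fpr': 0.5}
```

The overall accuracy of 0.625 hides the fact that the model is perfect for group A and wrong on three out of four examples for group B.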

We will go into specific examples in the exercises!

## 4. Interesting links

1. [Privacy not included](https://foundation.mozilla.org/en/privacynotincluded/) initiative from Mozilla
2. [Revisiting the Boston housing dataset](https://fairlearn.org/main/user_guide/datasets/boston_housing_data.html) from the fairlearn project