# MS&E 330: Law, Order, and Algorithms
## Tests for discrimination (outcome and threshold tests)

In [9]:
options(digits = 3)

library(tidyverse)

stop_df <- read_rds("../data/nc_sample.rds")
population <- read_rds("../data/nc_population.rds")

theme_set(theme_bw())

### The data

The loaded data frame `stop_df` is a sample of traffic stops in North Carolina. 
Below is a list of columns in the `stop_df` data:

* Base information regarding stop:
    * `id`, `stop_date`, `stop_time`, 
      `police_department`, `county_name`, 
      `officer_id`
* Circumstances which led to stop:
    * `violation`   
* Suspect demographics:
    * `driver_age`, `driver_gender`, `driver_race`
* Was person searched?: `search_conducted`
    * if yes: `search_basis`, `search_type`
* Was contraband found?: `contraband_found`     
* Stop outcome: `stop_outcome`
      
The `population` data frame contains statewide populations by race, using the US 
Census Bureau's ACS 5-year estimates (2012-2016). There are just two columns in
this data frame: `race` and `population`.

### Introduction

A seemingly straightforward way to check if ... (todo: finish intro)

Let's take state highway patrol stops in North Carolina as a case study. 
First, let's check stop counts by race.

In [8]:
# WRITE CODE HERE
# START solution
stop_df %>% count(driver_race)
# END solution

driver_race,n
Asian,9608
Black,257671
Hispanic,72363
Other,31842
White,628516


We need to put these counts in context in order to understand whether black
individuals are being stopped disproportionately. A natural first benchmark is
the base population. Try computing the per-capita stop rates by race:

In [11]:
# WRITE CODE HERE
# START solution
stop_df %>% 
  count(driver_race) %>% 
  left_join(
    population,
    by = c("driver_race" = "race")
  ) %>% 
  mutate(stop_rate = n / population)
# END solution

driver_race,n,population,stop_rate
Asian,9608,258118,0.0372
Black,257671,2104597,0.1224
Hispanic,72363,884763,0.0818
Other,31842,534587,0.0596
White,628516,6361438,0.0988


Relative to the residential population in North Carolina, black individuals are stopped at a
higher rate than white individuals (a per-capita stop rate of about 0.12 versus 0.10), while the
rate for Hispanic individuals (about 0.08) is lower than the rate for white individuals. However, the
argument could be made that black individuals are stopped more often than white individuals 
because they exhibit suspicious behavior at higher rates than white individuals. The implication of 
this statement is that instead of comparing the racial distribution of stops to the base racial 
distribution of the population, we should be comparing (or "benchmarking") to the underlying racial 
distribution of suspicious behavior being exhibited. 
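
To make the population benchmark easier to read, each group's per-capita stop rate can be expressed relative to the white rate. The following is a minimal sketch, not part of the original exercise: it rebuilds the per-capita table by hand from the counts printed above so that it runs without the data files, and the names `rates` and `ratio_to_white` are introduced here purely for illustration.

```r
library(dplyr)

# Counts and populations copied from the per-capita output above,
# so this sketch does not depend on stop_df or population.
rates <- tibble(
  driver_race = c("Asian", "Black", "Hispanic", "Other", "White"),
  n           = c(9608, 257671, 72363, 31842, 628516),
  population  = c(258118, 2104597, 884763, 534587, 6361438)
) %>%
  mutate(
    stop_rate = n / population,
    # Ratio of each group's per-capita stop rate to the white rate
    # (illustrative column name, not from the original notebook).
    ratio_to_white = stop_rate / stop_rate[driver_race == "White"]
  )

rates
```

A ratio above 1 means the group is stopped more often per resident than white individuals. For black individuals the ratio comes out to roughly 1.24. Like the per-capita rates themselves, this ratio still uses residential population as the benchmark, so it inherits the same limitations.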

The underlying distribution of who is exhibiting suspicious behavior, though, is nearly impossible 
to obtain. Common proxies include benchmarking to arrests
or to violent arrests, but both are problematic for a variety of reasons.

**Discuss with a partner:** Why might using arrests or violent arrests be problematic? What benefits, drawbacks, and assumptions come with each of these two benchmarks?

### The outcome test

The outcome test, first proposed by Gary Becker in 1957, gets around the "benchmarking problem" by focusing on stop outcomes (for example, recovery of weapons) and checking whether those outcomes differ systematically between whites and minorities. If stopped minorities are less likely than stopped whites to have weapons, it suggests that the bar for stopping minorities is lower than the bar for stopping whites. That lower bar is de facto discrimination.
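
The logic of the outcome test can be sketched on a small made-up example. Everything in the block below is hypothetical: the `toy_stops` data frame, its `weapon_found` column, and the counts are invented for illustration and are not drawn from `stop_df`. The sketch computes the "hit rate" (the share of stops in which a weapon is recovered) for each group.

```r
library(dplyr)

# HYPOTHETICAL toy data: 100 stops per group, with invented outcomes.
# White: 30 of 100 stops recover a weapon; Black: 20 of 100.
toy_stops <- tibble(
  driver_race  = rep(c("White", "Black"), each = 100),
  weapon_found = c(
    rep(c(TRUE, FALSE), times = c(30, 70)),
    rep(c(TRUE, FALSE), times = c(20, 80))
  )
)

# Hit rate = fraction of stops in each group that recover a weapon.
hit_rates <- toy_stops %>%
  group_by(driver_race) %>%
  summarize(
    n_stops  = n(),
    hit_rate = mean(weapon_found)
  )

hit_rates
```

In this invented example the hit rate is 0.2 for black drivers and 0.3 for white drivers. Under the outcome test, a lower hit rate for the minority group is read as evidence that those stops were made on weaker suspicion.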

**Discuss with a partner:** How could the outcome test be applied to test discrimination/bias in the case of lending decisions? Bail bond-setting decisions? Editorial acceptance decisions? What would evidence of bias against minorities look like in each of these three decision-making arenas?

Jumping back to our North Carolina data, to apply 
