# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [4]:
data.head()

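| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |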


# Question 1: What test is appropriate for this problem? Does CLT apply?

I would use Fisher's Exact Test, which is the gold standard testing tool for 2x2 tables of categorical variables and counts. The 2x2 table here is as follows:



In [17]:
# Turning the data of interest into a 2x2 table of counts

black = data[data.race =='b']
white = data[data.race =='w']

# Count callbacks and non-callbacks for each group ('call' is coded 1/0):

black_called = len(black[black['call'] == 1])
black_nocall = len(black[black['call'] == 0])

white_called = len(white[white['call'] == 1])
white_nocall = len(white[white['call'] == 0])



table = pd.DataFrame({'black':{'called':black_called,'not_called':black_nocall},
                      'white':{'called':white_called,'not_called':white_nocall}})


print("The 2x2 table of counts in this scenario is:")
table

The 2x2 table of counts in this scenario is:

| | black | white |
|---|---|---|
| called | 157 | 235 |
| not_called | 2278 | 2200 |
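
As an aside, a contingency table like this can also be built in one step with `pd.crosstab`. The sketch below uses a small made-up DataFrame (`toy`, with illustrative values only) in place of `data` so that it runs on its own; the row order (0 before 1) differs from the table above, which does not affect a two-sided Fisher's exact test.

```python
import pandas as pd

# Toy stand-in for `data`; these six rows are made up for illustration.
toy = pd.DataFrame({
    "race": ["b", "b", "b", "w", "w", "w"],
    "call": [0.0, 0.0, 1.0, 1.0, 0.0, 1.0],
})

# Rows are callback outcomes (0.0 / 1.0), columns are the name groups.
toy_table = pd.crosstab(toy["call"], toy["race"])
print(toy_table)
```

On the real data, the equivalent call would be `pd.crosstab(data["call"], data["race"])`.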

The Central Limit Theorem asserts that averages based on large samples have approximately normal sampling distributions. Let's see how many observations we have in our sample:

In [26]:
# Find out how many samples 

print(len(data))
print(len(black))
print(len(white))

4870
2435
2435


There are 4870 resumes in the sample, half assigned black-sounding names and half assigned white-sounding names, so the sample size is large. Because the names were randomly assigned to the resumes, the observations can be treated as independent. Under these conditions the Central Limit Theorem applies to the sample proportions, so a normal approximation to the difference in callback rates is reasonable.
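
A common rule of thumb for the normal approximation to a proportion is that each group should contain at least 10 successes and 10 failures. The sketch below checks this with the counts from this notebook: 157 callbacks out of 2435 for black-sounding names is printed above, while the white-sounding count of 235 is inferred from the difference in proportions reported later, so treat it as an assumption. The threshold of 10 is a convention rather than a strict requirement, and `meets_success_failure` is a hypothetical helper name.

```python
# Counts from the notebook output; 235 is inferred, not printed directly.
n_black, n_white = 2435, 2435
black_called, white_called = 157, 235

def meets_success_failure(successes, n, threshold=10):
    """Return True if both successes and failures reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

for label, called, n in [("black", black_called, n_black),
                         ("white", white_called, n_white)]:
    # Print callbacks, non-callbacks, and whether the rule of thumb holds.
    print(label, called, n - called, meets_success_failure(called, n))
```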

# Question 2: What are the null and alternate hypotheses?

Let $\pi_{bsn}$ be the proportion of applicants with black-sounding names who received callbacks, and $\pi_{wsn}$ be the proportion of applicants with white-sounding names who received callbacks. Then the null and alternative hypotheses are as follows:

$H_0$ : $\pi_{bsn} = \pi_{wsn}$

$H_A$ : $\pi_{bsn} \neq \pi_{wsn}$

# Question 3: Compute margin of error, confidence interval, and p-value

The margin of error is calculated as:

$$ z \sqrt{\frac{p_w(1-p_w)}{n_w} + \frac{p_b(1-p_b)}{n_b}} $$

where $z = 1.96$ since we want a 95% confidence interval, $p_w$ and $p_b$ are the proportions of applicants with white-sounding and black-sounding names who got called back, and $n_w$ and $n_b$ are the corresponding numbers of resumes.

In [35]:
# Compute the margin of error:

z = 1.96

p_b = black_called / (len(black))
p_w = white_called / (len(white))

margin_of_error = z * (p_b*(1-p_b)/ len(black) + p_w*(1-p_w)/ len(white))**0.5
print(margin_of_error)

0.015255406349886438


Our margin of error is 0.0153, so our 95% confidence interval is:

$(p_b - p_w) \pm \text{margin of error}$

(the factor $z$ is already included in the margin of error).

In [47]:
# Calculate confidence interval

print(p_b-p_w)

lower = p_b - p_w - margin_of_error
upper = p_b - p_w + margin_of_error

print(lower, upper)


-0.032032854209445585
-0.047288260559332024 -0.016777447859559147


This tells us that we are 95% confident that the true callback rate for applicants with black-sounding names is between 0.017 and 0.047 (1.7 to 4.7 percentage points) lower than the rate for applicants with white-sounding names. Notice that 0 does not fall inside this confidence interval, which suggests that we will ultimately reject the null hypothesis.

Let's verify by calculating the p-value:

In [46]:
# Calculate the p-value

import scipy.stats as stats 

oddsratio, p_value = stats.fisher_exact(table)

print(p_value)

4.7587471079e-05


This is a tiny p-value, suggesting that we should reject the null hypothesis.
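
Since the Central Limit Theorem applies, a two-proportion z-test offers a cross-check on Fisher's exact test. The following is a minimal sketch rather than part of the original analysis: it hard-codes the counts from this notebook (the white-sounding count of 235 is inferred from the reported difference in proportions, so treat it as an assumption) and uses the pooled proportion for the standard error, as is usual under $H_0$.

```python
from math import sqrt
from scipy.stats import norm

# Counts from the notebook; 235 is inferred from the reported difference.
black_called, n_black = 157, 2435
white_called, n_white = 235, 2435

p_b = black_called / n_black
p_w = white_called / n_white

# Under H0 both groups share one callback rate, so pool the counts.
p_pool = (black_called + white_called) / (n_black + n_white)
se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_black + 1 / n_white))

z_stat = (p_b - p_w) / se_pool
p_value_z = 2 * norm.sf(abs(z_stat))  # two-sided p-value

print(z_stat, p_value_z)
```

The z-test p-value should be of the same order of magnitude as the Fisher's exact result, leading to the same conclusion.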

# Question 4: Write a story describing the statistical significance in the context of the original problem.

In this study, we sought to measure the level of racial discrimination in the United States labor market by randomly assigning identical resumes to black-sounding or white-sounding names and observing how this affected the number of callbacks from employers. After an investigation of the data, I find a statistically significant difference: resumes with black-sounding names received callbacks at a rate 3.2 percentage points lower than identical resumes with white-sounding names (about 6.4% versus 9.7%). If the name on a resume truly made no difference, a gap at least this extreme would arise by chance only about 0.0048% of the time. With such a small p-value, we reject the null hypothesis and conclude that the perceived race of the name plays some role in whether a resume receives a callback.

# Question 5: Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

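No. A statistically significant effect of race/name on callback success does not make it the most important factor. This analysis considered only one variable and says nothing about how its effect compares with that of other variables. To amend the analysis, I would examine the other resume characteristics recorded in the dataset (for example education, years of experience, and honors), as well as the gender implied by the name and how gender and race interact in callback success.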
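
One simple way to start such a follow-up is to compare callback rates across combinations of factors. The sketch below is illustrative only: `callback_rate_by` is a hypothetical helper, the rows in `toy` are made up, and `sex` is an assumed column name that does not appear in the output shown in this notebook.

```python
import pandas as pd

def callback_rate_by(df, columns):
    """Mean of the 0/1 `call` column within each combination of `columns`."""
    return df.groupby(columns)["call"].mean()

# Made-up rows for illustration; `sex` is an assumed column name.
toy = pd.DataFrame({
    "race": ["b", "b", "w", "w", "b", "w"],
    "sex":  ["f", "m", "f", "m", "f", "m"],
    "call": [0, 0, 1, 0, 1, 1],
})

rates = callback_rate_by(toy, ["race", "sex"])
print(rates)
```

Comparing raw rates like this is only a first look; judging which factor matters most would require a model that accounts for several variables at once, such as a logistic regression.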