# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
</div>
****

In [1]:
from __future__ import division  # must come first; makes / true division in Python 2
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as pyplot
import seaborn as sns
from scipy import stats
%matplotlib inline

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [5]:
data.shape

(4870, 65)

In [6]:
#data.call

More initial exploring.

In [7]:
df = data[['race', 'call']]
df.head()

Unnamed: 0,race,call
0,w,0.0
1,w,0.0
2,b,0.0
3,b,0.0
4,w,0.0


In [8]:
blacks = df[df.race == 'b']
whites = df[df.race == 'w']

In [9]:
blacks.head()

Unnamed: 0,race,call
2,b,0.0
3,b,0.0
7,b,0.0
8,b,0.0
9,b,0.0


In [10]:
whites.head()

Unnamed: 0,race,call
0,w,0.0
1,w,0.0
4,w,0.0
5,w,0.0
6,w,0.0


In [11]:
sum(data[data.race=='b'].call)

157.0

In [12]:
len(blacks)

2435

In [13]:
sum(data[data.race=='b'].call) / len(blacks)

0.064476386036960986

Overall callback rate, regardless of race.

In [14]:
prob_called = sum(df['call']) / len(df)
prob_called

0.080492813141683772

A little data wrangling below.

In [15]:
black_called = sum(data[data.race=='b'].call)
black_notcalled = len(blacks) - sum(data[data.race=='b'].call)
white_called = sum(data[data.race=='w'].call)
white_notcalled = len(whites) - sum(data[data.race=='w'].call)

Callback rate for white-sounding names.

In [16]:
prob_white_called = white_called/len(whites)
prob_white_called

0.096509240246406572

Callback rate for black-sounding names.

In [17]:
prob_black_called = black_called/len(blacks)
prob_black_called

0.064476386036960986

In [18]:
results = pd.DataFrame({'black':{'called':black_called,'not_called':black_notcalled},
                       'white':{'called':white_called,'not_called':white_notcalled}})
results

Unnamed: 0,black,white
called,157.0,235.0
not_called,2278.0,2200.0


### What test is appropriate for this problem? Does CLT apply?

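A two-sample z-test for the difference between two proportions is appropriate; a chi-square test of independence on the 2x2 table of race versus callback is equivalent. The CLT applies even though each observation is binary (called or not called) rather than normally distributed: names were assigned to resumes at random, both samples are large (2,435 resumes each), and each group has far more than 10 callbacks and 10 non-callbacks, so the sample proportions are approximately normal.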
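
A common rule of thumb for the normal approximation is that each group should contain at least 10 successes and 10 failures. The sketch below applies that check to the counts from the results table above (157 of 2435 and 235 of 2435). The threshold of 10 and the helper name `clt_ok` are assumptions made for illustration, not part of the original analysis.

```python
# Rule-of-thumb check for the normal approximation to a sample proportion.
# Counts come from the results table; the threshold of 10 is a convention.
groups = {
    'black': {'called': 157, 'n': 2435},
    'white': {'called': 235, 'n': 2435},
}


def clt_ok(called, n, threshold=10):
    """Hypothetical helper: True if successes and failures both reach the threshold."""
    return called >= threshold and (n - called) >= threshold


checks = {name: clt_ok(g['called'], g['n']) for name, g in groups.items()}
print(checks)
```

Both groups pass comfortably, which is why the z-test and the chi-square test are reasonable choices here.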

### What are the null and alternate hypotheses?

H0: The callback rate is the same for resumes with black-sounding and white-sounding names; race has no effect on callbacks.

H1: The callback rate differs between resumes with black-sounding and white-sounding names.
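
The same hypotheses can be written in symbols. The notation is introduced here for convenience and is not from the original exercise: $$p_b$$ and $$p_w$$ are the true callback rates for black-sounding and white-sounding names, $$\hat{p}_b$$ and $$\hat{p}_w$$ are the sample rates, $$\hat{p}$$ is the pooled sample rate, and $$n_b$$ and $$n_w$$ are the group sizes.

$$H_0: p_w = p_b \qquad H_1: p_w \neq p_b$$

Under $$H_0$$, a two-proportion z statistic with a pooled standard error is

$$z = \frac{\hat{p}_w - \hat{p}_b}{\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_b} + \frac{1}{n_w}\right)}}$$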

### Compute margin of error, confidence interval, and p-value.

In [19]:
# Get totals from data
total_called = sum(data.call)
total_notcalled = len(data) - sum(data.call)

# Expected number of callbacks per group under H0 (the groups are the same size, so the total splits evenly).
total_called / 2

196.0

In [20]:
#Use a chisquare test mentioned on KA with 1 degree of freedom since (#col-1)*(#rows-1) = 1. 
result_freq = [black_called, white_called, black_notcalled, white_notcalled]
expected_freq = [total_called/2,total_called/2,total_notcalled/2,total_notcalled/2]

In [21]:
# ddof=2 lowers the default k - 1 = 3 degrees of freedom (4 cells) to 1, as for a 2x2 table
stats.chisquare(f_obs=result_freq, f_exp = expected_freq, ddof=2)

Power_divergenceResult(statistic=16.879050414270221, pvalue=3.983886837585076e-05)

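Our $$\chi^2$$ value is about 16.9 with a p-value of 3.98e-05. This is far below the conventional 0.05 threshold, so we reject the null hypothesis: a difference in callback rates this large is very unlikely to be due to chance alone.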
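
As a cross-check, the same test can be run on the 2x2 table directly with `scipy.stats.chi2_contingency`, which works out the expected counts and degrees of freedom itself. This is a sketch using the counts from the results table above; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Observed counts from the results table:
# rows = called / not called, columns = black-sounding / white-sounding names.
observed = np.array([[157, 235],
                     [2278, 2200]])

# correction=False turns off the Yates continuity correction so the
# statistic is the plain (uncorrected) Pearson chi-square.
chi2, p_value, dof, expected = stats.chi2_contingency(observed, correction=False)

print(chi2, p_value, dof)
print(expected)
```

With the continuity correction left on (the default for 2x2 tables), the statistic would be slightly smaller, so the two settings should not be compared digit for digit.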

In [22]:
# Calculate the standard error 
stderr = np.sqrt(prob_called*(1-prob_called)*(1/len(blacks)+1/len(whites)))
print(stderr)

# Get the z-score
z_score = (prob_white_called - prob_black_called) / stderr
z_score

0.00779689403617


4.1084121524343464

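Note for Python 2: either convert the counts to floats or use `from __future__ import division`; otherwise `1/len(blacks)` is integer division and evaluates to 0.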
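
The z-score can be turned into a two-sided p-value using the standard normal survival function. The sketch below recomputes the pooled z statistic from the raw counts in the results table so that it runs on its own. It assumes Python 3 division (or `from __future__ import division` in Python 2), and the variable names are illustrative.

```python
import math
from scipy import stats

# Group sizes and callback counts from the results table.
n_b, n_w = 2435, 2435
called_b, called_w = 157, 235

p_b = called_b / n_b
p_w = called_w / n_w
p_pool = (called_b + called_w) / (n_b + n_w)

# Pooled standard error, appropriate under H0 (equal callback rates).
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))
z = (p_w - p_b) / se_pooled

# Two-sided p-value: probability of a |z| at least this large under H0.
p_value = 2 * stats.norm.sf(abs(z))
print(z, p_value)
```

For a 2x2 table the square of this z statistic equals the uncorrected chi-square statistic, so the two tests give the same p-value.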

In [23]:
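# Critical value for a two-sided 95% confidence level
z_critical = 1.96

# Unpooled standard error, the usual choice for a confidence interval. It is kept for
# reference only: the interval below uses the pooled `stderr` from the previous cell.
std_err_unpooled = np.sqrt(prob_black_called *(1-prob_black_called)/len(blacks)+ 
                           prob_white_called*(1-prob_white_called)/len(whites))

# Margin of error and confidence interval for the difference (white - black)
margin_of_error = z_critical * stderr
diff = prob_white_called - prob_black_called
conf_interval = [diff - margin_of_error, diff + margin_of_error]
conf_interval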

[0.016750941898551489, 0.047314766520339682]

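We are 95% confident that the true difference in callback rates between white-sounding and black-sounding names lies in this range, roughly 1.7 to 4.7 percentage points. The margin of error is about 1.5 percentage points, and because the interval excludes zero, the difference is statistically significant at the 5% level.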
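
For a confidence interval, the standard error is often computed without pooling, because the interval does not assume the two rates are equal. The sketch below shows that variant using the counts from the results table. It assumes Python 3 division, and the variable names are illustrative.

```python
import math

# Group sizes and callback rates from the results table.
n_b, n_w = 2435, 2435
p_b = 157 / n_b
p_w = 235 / n_w

z_critical = 1.96  # two-sided 95% confidence level

# Unpooled standard error: each group contributes its own variance.
se_unpooled = math.sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)

margin_of_error = z_critical * se_unpooled
diff = p_w - p_b
ci = (diff - margin_of_error, diff + margin_of_error)
print(margin_of_error, ci)
```

With samples this large, the pooled and unpooled intervals differ only in the fourth or fifth decimal place, so the conclusion is the same either way.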

### Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

Let's look at gender to see whether it explains the difference in callbacks better than race does.

More data wrangling to split each race group by gender.

In [24]:
# Look up each resume's sex by index so the counts align with the race subsets
white_sex = data.loc[whites.index, 'sex']
black_sex = data.loc[blacks.index, 'sex']

# White females
print("The number of white females in the study is: {}".format((white_sex == 'f').sum()))

# White males
print("The number of white males in the study is: {}".format((white_sex == 'm').sum()))

# Black females
print("The number of black females in the study is: {}".format((black_sex == 'f').sum()))

# Black males
print("The number of black males in the study is: {}".format((black_sex == 'm').sum()))

The number of white females in the study is: 1860
The number of white males in the study is: 575
The number of black females in the study is: 1886
The number of black males in the study is: 549


In [25]:
# Look up each resume's sex by index so the masks align with the race subsets
white_sex = data.loc[whites.index, 'sex']
black_sex = data.loc[blacks.index, 'sex']

# Percentage of white females called back
wf_cb = round(sum(whites[white_sex == 'f'].call) / len(whites[white_sex == 'f']) * 100, 2)
print(wf_cb)

# Percentage of white males called back
wm_cb = round(sum(whites[white_sex == 'm'].call) / len(whites[white_sex == 'm']) * 100, 2)
print(wm_cb)

# Percentage of black females called back (denominator is the number of black females)
bf_cb = round(sum(blacks[black_sex == 'f'].call) / len(blacks[black_sex == 'f']) * 100, 2)
print(bf_cb)

# Percentage of black males called back (denominator is the number of black males)
bm_cb = round(sum(blacks[black_sex == 'm'].call) / len(blacks[black_sex == 'm']) * 100, 2)
print(bm_cb)

9.89
8.87
6.63
5.83


In [26]:
# Difference in percentage points between white females and white males being called back
print(round(wf_cb - wm_cb, 2))

# Difference in percentage points between black females and black males being called back
print(round(bf_cb - bm_cb, 2))

1.02
0.8


### Write a story describing the statistical significance in the context of the original problem.

Callback rates are slightly higher for female names in both groups: by 1.02 percentage points for white-sounding names and 0.80 percentage points for black-sounding names. Because the gender gap is small and similar in both groups, gender does not explain the racial gap in callbacks. The resumes also varied in other ways, including education, work experience, military experience, and special skills. This analysis therefore shows that a white-sounding name has a statistically significant effect on callbacks, not that it is the most important factor; ranking it against the other factors would require modeling them together, for example with a regression. Still, the size of the effect is striking. In this Poverty Action Lab data, white-sounding names received about 50% more callbacks than black-sounding names (9.65% versus 6.45%), and white-sounding female names were called back about 70% more often than black-sounding male names (9.89% versus 5.83%).
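
One way to check whether the gender gaps themselves are distinguishable from chance is to run the same pooled two-proportion z-test within each race group. The sketch below does that. The group sizes are the ones printed above, but the callback counts (184, 51, 125 and 32) are assumptions: they are back-calculated to agree with those group sizes and to sum to the race totals of 235 and 157. The helper `two_prop_ztest` is hypothetical and written for this example; Python 3 division is assumed.

```python
import math
from scipy import stats


def two_prop_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * stats.norm.sf(abs(z))


# Arguments: female callbacks, female resumes, male callbacks, male resumes.
# Callback counts are ASSUMED (back-calculated), not read from the dataset.
z_white, p_white = two_prop_ztest(184, 1860, 51, 575)
z_black, p_black = two_prop_ztest(125, 1886, 32, 549)

print("white-sounding names: z = {:.3f}, p = {:.3f}".format(z_white, p_white))
print("black-sounding names: z = {:.3f}, p = {:.3f}".format(z_black, p_black))
```

Under these assumed counts, neither gender gap comes close to significance at the 5% level, which is consistent with gender not driving the racial difference in callbacks.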