# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

In [8]:
# Import packages
import pandas as pd
import numpy as np
from scipy import stats

In [7]:
# Read data into a pandas dataframe
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# Number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [4]:
data.head()

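| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |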


In [6]:
#Filter dataframe 
w = data[data.race=='w']
b = data[data.race=='b']

# 1. What test is appropriate for this problem? Does CLT apply?

For this project, we want to determine whether race has a statistically significant effect on the callback rate for resumes. The outcome is binary (callback or no callback), so we compare the sample proportions of callbacks in the two groups using a two-sample z-test for proportions.

In [27]:
#Does CLT apply?
sample_size_w = len(w)
sample_size_b = len(b)

print('Sample size for white-sounding names:', sample_size_w)
print('Sample size for black-sounding names:', sample_size_b)

Sample size for white-sounding names: 2435
Sample size for black-sounding names: 2435


Both samples are large enough for the CLT to apply. Because the names were assigned to the resumes at random, we also have no reason to doubt that the observations are independent of one another.
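Sample size alone is not the whole story for proportions. A common rule of thumb also asks for at least 10 successes and 10 failures in each group. The sketch below applies that check to the callback counts from this notebook. The helper name `clt_ok` and the threshold of 10 are illustrative assumptions, not part of the original analysis.

```python
def clt_ok(successes: int, n: int, threshold: int = 10) -> bool:
    """Return True if both successes and failures meet the rule-of-thumb threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Callback counts: 235 appears in the output above; 157 is implied by the
# black-sounding callback rate (about 0.0645 * 2435) reported in section 3.
groups = {"white-sounding": (235, 2435), "black-sounding": (157, 2435)}

for name, (callbacks, n) in groups.items():
    print(name, "meets the success/failure condition:", clt_ok(callbacks, n))
```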

# 2. What are the null and alternate hypotheses?

The null hypothesis is that race has no effect on whether an applicant receives a callback or not.

The alternative hypothesis is that race does have an effect on whether an applicant receives a callback or not.
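
The same hypotheses can be written in symbols. Here $p_w$ and $p_b$ denote the population callback rates for white-sounding and black-sounding names, $x$ denotes a callback count and $n$ a sample size. This notation is introduced for convenience and does not come from the dataset. The test statistic is the pooled two-proportion z-score computed in the next section.

$$
H_0: p_w = p_b \qquad H_A: p_w \neq p_b
$$

$$
z = \frac{\hat{p}_w - \hat{p}_b}{\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_w} + \frac{1}{n_b}\right)}}, \qquad \hat{p} = \frac{x_w + x_b}{n_w + n_b}
$$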

# 3. Compute margin of error, confidence interval, and p-value.

In [26]:
#Select the callback indicator (1 = callback, 0 = no callback) for each group
w_callback = w.call
b_callback = b.call

#Find the mean of each, and their difference
w_callback_mean = np.mean(w_callback)
b_callback_mean = np.mean(b_callback)
w_b_mean_diff = w_callback_mean - b_callback_mean

print('White callback mean:', w_callback_mean)
print('Black callback mean:', b_callback_mean)
print('Difference of means:', w_b_mean_diff)

White callback mean: 0.09650924056768417
Black callback mean: 0.0644763857126236
Difference of means: 0.03203285485506058


In [37]:
#Calculate the proportion of callbacks for black and white applicants
prop_w = np.sum(w_callback) / sample_size_w
prop_b = np.sum(b_callback) / sample_size_b

#Calculate the difference in sample proportions
prop_diff = prop_w - prop_b

#Calculate p-hat
phat = (np.sum(w.call) + np.sum(b.call)) / (sample_size_w + sample_size_b)

#Z-score
z = prop_diff / np.sqrt(phat * (1 - phat) * ((1 / sample_size_w) + (1 / sample_size_b)))
            
#Calculate two-sided p-value (abs(z) keeps this correct if z is negative)
p_value = stats.norm.sf(abs(z)) * 2

print('Z-score:', z)
print('p-value:', p_value)

#Calculate margin of error and 95% confidence interval (1.96 is the z critical value)
#Note: this reuses the pooled standard error from the z-test
moe = 1.96 * np.sqrt(phat * (1 - phat) * ((1 / sample_size_w) + (1 / sample_size_b)))
ci = prop_diff + np.array([-1, 1]) * moe

print('\nMargin of error:', moe)
print('Confidence interval:', ci)

Z-score: 4.108412152434346
p-value: 3.983886837585077e-05

Margin of error: 0.015281912310894095
Confidence interval: [0.01675094 0.04731477]
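
As a cross-check, the calculation can be wrapped in one function that works from the raw counts. This is a minimal sketch using only the standard library, not the notebook's own method. The function name `two_proportion_ztest` is hypothetical. It also makes one deliberate change: the confidence interval uses the unpooled standard error, which is the usual choice for an interval because it does not assume the two proportions are equal.

```python
import math
from statistics import NormalDist

def two_proportion_ztest(x1, n1, x2, n2, confidence=0.95):
    """Two-sided z-test for p1 - p2, plus an unpooled confidence interval."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Pooled standard error: appropriate under the null hypothesis p1 == p2
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error: does not assume the two proportions are equal
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    moe = z_crit * se_unpooled
    return z, p_value, (diff - moe, diff + moe)

# Counts taken from the notebook output: 235 and 157 callbacks out of 2435 each
z, p_value, ci = two_proportion_ztest(235, 2435, 157, 2435)
print("Z-score:", z)
print("p-value:", p_value)
print("Unpooled 95% CI:", ci)
```

With these counts the pooled and unpooled standard errors differ only slightly, so the resulting interval is very close to the one printed above.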


# 4. Write a story describing the statistical significance in the context of the original problem.

Our calculations tell us, with 95% confidence, that the callback rate for applicants with white-sounding names is between 1.7 and 4.7 percentage points higher than the callback rate for applicants with black-sounding names.

The p-value is the probability of observing a difference in callback rates at least as large as ours if race truly had no effect. Since the p-value (about 0.00004) is far below our significance level (0.05), we reject the null hypothesis that race has no effect on whether an applicant receives a callback.
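
One way to see what the p-value measures is to simulate the null hypothesis directly: give both groups the same pooled callback rate and count how often chance alone produces a gap as large as the observed one. The sketch below is illustrative only. The seed, the number of simulations, and the helper `simulated_callbacks` are assumptions chosen to keep it short and fast. Because the true p-value is so small, a run of this size will usually find few or no simulated gaps that large.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

n = 2435                               # resumes per group
pooled_rate = (235 + 157) / (2 * n)    # callback rate if names made no difference
observed_diff = (235 - 157) / n        # observed gap in callback rates

def simulated_callbacks(n, rate):
    """Count callbacks among n resumes when each succeeds with probability `rate`."""
    return sum(random.random() < rate for _ in range(n))

n_sims = 500  # kept small so the sketch runs quickly
extreme = 0
for _ in range(n_sims):
    diff = (simulated_callbacks(n, pooled_rate) - simulated_callbacks(n, pooled_rate)) / n
    if abs(diff) >= observed_diff:
        extreme += 1

print("Share of simulated differences at least as large as observed:", extreme / n_sims)
```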

# 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

Our analysis indicates, with high confidence, that the race implied by a name affects the callback rate. It does not tell us that race/name is the *most* important factor in callback success, because we examined only one variable. To determine that, we would need to model callbacks against the other resume attributes in the dataset (education, years of experience, and so on) with a tool such as regression. Since the outcome is binary, logistic regression would be a natural fit.