# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a résumé. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding names. The 'call' column has two values, 1 and 0, indicating whether or not the résumé received a call from an employer.

Note that the 'b' and 'w' values in 'race' were assigned randomly to the résumés when they were presented to employers.

In [1]:
import pandas as pd
import numpy as np
import scipy.stats as st
import statsmodels.stats.api as sms

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')
data.head()

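| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |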


# 1. What test is appropriate for this problem? Does CLT apply?

In [4]:
print('There are {} records in this sample.'.format(len(data)))
print('There are {} records for \'b\' race in this sample.'.format(len(data[data['race']=='b'])))
print('There are {} records for \'w\' race in this sample.'.format(len(data[data['race']=='w'])))

There are 4870 records in this sample.
There are 2435 records for 'b' race in this sample.
There are 2435 records for 'w' race in this sample.


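The CLT applies here: each group contains 2,435 independent observations (names were assigned to résumés at random), and the binary 'call' outcome has finite variance, so the sampling distribution of each group's mean callback rate, and of the difference between the two, is approximately normal. Since we are comparing the means of two independent samples, a two-sample t-test is appropriate. Because 'call' is binary, this is effectively a comparison of two proportions, and at this sample size the t-test and a two-proportion z-test give nearly identical results.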
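A common rule of thumb for trusting the normal approximation with a binary outcome is that each group should have at least 10 expected successes and 10 expected failures. The sketch below checks that condition for the group size reported above. The function name, the threshold of 10, and the candidate callback rates are assumptions made for illustration, not part of the original notebook.

```python
def normal_approx_ok(n, p, threshold=10):
    """Rule-of-thumb check: expected successes and failures both reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


n_per_group = 2435  # group size reported above

# Hypothetical callback rates, used only to show how robust the condition is.
for rate in (0.01, 0.05, 0.10):
    print(rate, normal_approx_ok(n_per_group, rate))
```

With 2,435 résumés per group, the condition holds even for callback rates as low as 1%, which supports relying on the CLT here.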

# 2. What are the null and alternate hypotheses?

$H_0$: $M_{\text{b calls}} - M_{\text{w calls}} = 0$  
$H_1$: $M_{\text{b calls}} - M_{\text{w calls}} \neq 0$
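
For reference, one standard test statistic for these hypotheses is the unequal-variance (Welch) two-sample statistic sketched below. The symbols $\bar{x}$, $s^2$, and $n$ (the sample mean, sample variance, and size of each group) are notation introduced here, not taken from the notebook.

$$
t = \frac{\bar{x}_b - \bar{x}_w}{\sqrt{\dfrac{s_b^2}{n_b} + \dfrac{s_w^2}{n_w}}}
$$

When the two groups have the same size, as they do here, this coincides with the pooled-variance statistic. Under the null hypothesis it is approximately standard normal for groups this large, so large absolute values count as evidence against equal callback rates.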

# 3. Compute margin of error, confidence interval, and p-value.

In [5]:
data_b = data[data['race'] == 'b']
data_w = data[data['race'] == 'w']

print('The sample mean for \'b\' race callbacks is {:.2f}.'.format(data_b['call'].mean()))
print('The sample mean for \'w\' race callbacks is {:.2f}.'.format(data_w['call'].mean()))
print('The sample mean difference is {:.3f}.'.format(data_b['call'].mean() - data_w['call'].mean()))

The sample mean for 'b' race callbacks is 0.06.
The sample mean for 'w' race callbacks is 0.10.
The sample mean difference is -0.032.


In [6]:
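# Two-sample t-test (pooled variance by default). With equal group sizes the
# t-statistic matches the unequal-variance version used for the interval below.
t_stat, p_value = st.ttest_ind(data_b['call'], data_w['call'])
cm = sms.CompareMeans(sms.DescrStatsW(data_b['call']), sms.DescrStatsW(data_w['call']))

# Compute the 95% confidence interval once and reuse it.
ci_low, ci_high = cm.tconfint_diff(usevar='unequal')
mean_diff = data_b['call'].mean() - data_w['call'].mean()

print('The t-statistic is {:.3f} and the p-value is {:.6f}.'.format(t_stat, p_value))
print('The 95% confidence interval about the mean difference is ({:.3f}, {:.3f}).'.format(ci_low, ci_high))
# The interval is symmetric about the mean difference, so the margin of error is its half-width.
print('The margin of error is {:.3f}.'.format(mean_diff - ci_low))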

The t-statistic is -4.115 and the p-value is 0.000039.
The 95% confidence interval about the mean difference is (-0.047, -0.017).
The margin of error is 0.015.
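
To make the arithmetic behind these numbers explicit, here is a minimal standard-library sketch that treats the callback rates as two proportions and uses the normal approximation. It is not the notebook's method (which uses `scipy` and `statsmodels`): it uses the unpooled standard error for both the interval and the z statistic. The callback counts 157 and 235 are assumptions chosen to be consistent with the rounded rates printed above; they are not shown in the notebook output.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_summary(x_b, n_b, x_w, n_w, confidence=0.95):
    """Difference in proportions with a normal-approximation interval and z-test."""
    p_b, p_w = x_b / n_b, x_w / n_w
    diff = p_b - p_w
    # Unpooled standard error of the difference between two proportions.
    se = sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * se
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"diff": diff, "se": se, "margin": margin,
            "ci": (diff - margin, diff + margin), "z": z, "p_value": p_value}


# Assumed counts (not shown in the notebook): callbacks out of 2435 résumés per group.
summary = two_proportion_summary(157, 2435, 235, 2435)
for name, value in summary.items():
    print(name, value)
```

Under these assumed counts, the results should land close to the t-based output above, since the t and normal distributions are nearly indistinguishable with several thousand observations.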


# 4. Write a story describing the statistical significance in the context of the original problem.

Given the p-value, test statistic, and 95% confidence interval, there is a statistically significant difference between the callback rate for black-sounding names and that for white-sounding names. We are 95% confident that résumés with white-sounding names receive callbacks at a rate between 0.017 and 0.047 higher (1.7 to 4.7 percentage points) than résumés with black-sounding names. In absolute terms, however, this gap is small, so the result is arguably not practically significant, and it seems that the difference in callbacks between candidates is not completely explained by race alone.
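
Practical significance can be judged more concretely with an effect size. The sketch below computes two common ones for a pair of proportions: the difference relative to a reference rate, and Cohen's h. The function names are introduced here for illustration, and the inputs are the rounded callback rates printed earlier (0.06 and 0.10), so the results are only rough.

```python
from math import asin, sqrt


def relative_difference(p_group, p_reference):
    """Difference in rates as a fraction of the reference group's rate."""
    return (p_group - p_reference) / p_reference


def cohens_h(p1, p2):
    """Cohen's h effect size for two proportions (arcsine transformation)."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


# Rounded callback rates printed earlier in the notebook.
rate_b, rate_w = 0.06, 0.10
print('Relative difference: {:.2f}'.format(relative_difference(rate_b, rate_w)))
print("Cohen's h: {:.3f}".format(cohens_h(rate_b, rate_w)))
```

The two measures can point in different directions: by Cohen's conventional benchmarks an |h| below 0.2 counts as a small effect, while the same gap can still be a sizeable fraction of a low baseline callback rate. Which view matters more depends on the question being asked.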

# 5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

The analysis indicates that race/name is a factor in callback success, but not a large enough one to fully explain why certain candidates receive more callbacks than others. Other features in the dataset, such as education, years of experience, and previous jobs, may have a larger effect. To describe this more fully, I would explore how these other features relate to callbacks and how they differ with respect to race.
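
As a starting point for that follow-up, the sketch below compares the mean of a feature across groups using only the standard library. The `group_means` helper and the toy rows are hypothetical and for illustration only; the values are not taken from the dataset.

```python
from collections import defaultdict


def group_means(rows, group_key, feature):
    """Mean of `feature` within each level of `group_key`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[feature]
        counts[row[group_key]] += 1
    return {level: totals[level] / counts[level] for level in totals}


# Hypothetical toy rows for illustration only; not values from the dataset.
toy_rows = [
    {"race": "b", "yearsexp": 6, "call": 0},
    {"race": "b", "yearsexp": 8, "call": 1},
    {"race": "w", "yearsexp": 6, "call": 1},
    {"race": "w", "yearsexp": 10, "call": 0},
]

# Is the feature balanced across race groups?
print(group_means(toy_rows, "race", "yearsexp"))
# How does the callback rate vary with the feature?
print(group_means(toy_rows, "yearsexp", "call"))
```

On the notebook's DataFrame, the same comparison could presumably be written as `data.groupby('race')['yearsexp'].mean()`. Because names were assigned at random, large differences in features between the two race groups would be unexpected, so the second comparison (feature versus callback rate) is likely the more informative one.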