# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

In [2]:
import pandas as pd
import numpy as np
from scipy import stats

In [3]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [4]:
# number of callbacks for black-sounding names
sum(data[data.race=='w'].call)

235.0

In [5]:
data.head()

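| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |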


Note that the value shown for input 4 is mislabeled. The comment says it counts callbacks for black-sounding names, but the code filters on `race=='w'`, so 235 is actually the number of callbacks for white-sounding names. It is not the value we will use below for the number of callbacks for black-sounding names.

We want to compare the callback rate for white-sounding names with the callback rate for black-sounding names. Because `call` is coded as 0 or 1, the mean of `call` within each group is that group's callback rate. We will use a two-sample t-test for the comparison. Each group has far more than 30 observations, so a z-test would also be reasonable, but since we are comparing two independent samples with each other rather than one sample with a known reference value, we use the two-sample t-test. Our null hypothesis is that the mean callback rate is the same for both groups of names. The alternative hypothesis is that the two means differ.
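Stated formally, the hypotheses and the test statistic can be sketched as follows. The symbols $p_w$ and $p_b$ (callback rates for white- and black-sounding names), $\bar{x}$, $s$, and $n$ (sample mean, sample standard deviation, and sample size of `call` in each group) are notation introduced here for illustration. The pooled-variance form shown is what `scipy.stats.ttest_ind` computes by default (`equal_var=True`).

$$
H_0: p_w = p_b \qquad H_1: p_w \neq p_b
$$

$$
t = \frac{\bar{x}_w - \bar{x}_b}{s_p \sqrt{\dfrac{1}{n_w} + \dfrac{1}{n_b}}}, \qquad s_p^2 = \frac{(n_w - 1)\,s_w^2 + (n_b - 1)\,s_b^2}{n_w + n_b - 2}
$$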

In [6]:
w = data[data.race=='w']
b = data[data.race=='b']

In [7]:
data['race'].value_counts()

w    2435
b    2435
Name: race, dtype: int64

In [8]:
data['call'].value_counts()

0.0    4478
1.0     392
Name: call, dtype: int64

In [9]:
call_w = sum(data[data.race=='w'].call)
call_w

235.0

In [10]:
call_b = sum(data[data.race=='b'].call)
call_b

157.0

In [11]:
prop_w = call_w/2435
prop_w

0.09650924024640657

In [12]:
prop_b = call_b/2435
prop_b

0.06447638603696099

In [13]:
t = stats.ttest_ind(w['call'], b['call'])
t

Ttest_indResult(statistic=4.114705290861751, pvalue=3.940802103128886e-05)

The small p-value (about 3.9e-05, far below the usual 0.05 threshold) indicates that we can reject the null hypothesis that the two groups have the same mean callback rate. The difference between the two groups is statistically significant.
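As a cross-check, the same comparison can be sketched as a two-proportion z-test using only the counts reported above (235 and 157 callbacks out of 2435 résumés each). This is a swapped-in technique, not the t-test used in this notebook, and the variable names are chosen here for illustration. With samples this large the z statistic should land close to the t statistic.

```python
from math import sqrt
from statistics import NormalDist

# Counts taken from the notebook outputs above.
calls_w, n_w = 235, 2435   # white-sounding names
calls_b, n_b = 157, 2435   # black-sounding names

p_w = calls_w / n_w
p_b = calls_b / n_b

# Pooled callback rate under the null hypothesis of equal rates.
p_pool = (calls_w + calls_b) / (n_w + n_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))

z = (p_w - p_b) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(f"z = {z:.3f}, p-value = {p_value:.2e}")
```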

We will now calculate a confidence interval for the difference in callback rates.

In [14]:
import statsmodels.stats.api as sms

In [15]:
conf_int = sms.CompareMeans(sms.DescrStatsW(w['call']), sms.DescrStatsW(b['call']))
conf_int.tconfint_diff(usevar='unequal')

(0.016770673983991798, 0.04729503443489937)

The 95% confidence interval for the difference in callback rates (white-sounding minus black-sounding), roughly 0.017 to 0.047, does not contain zero. This is another indicator that our null hypothesis can be rejected.
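The interval reported by `tconfint_diff(usevar='unequal')` can be approximately reproduced by hand from the counts alone. The sketch below assumes a normal critical value in place of the t critical value, which is a close approximation at this sample size. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts taken from the notebook outputs above.
calls_w, n_w = 235, 2435   # white-sounding names
calls_b, n_b = 157, 2435   # black-sounding names
p_w, p_b = calls_w / n_w, calls_b / n_b

# Sample variance of a 0/1 column, using the n - 1 denominator.
var_w = p_w * (1 - p_w) * n_w / (n_w - 1)
var_b = p_b * (1 - p_b) * n_b / (n_b - 1)

# Unequal-variance (Welch) standard error of the difference in means.
se = sqrt(var_w / n_w + var_b / n_b)

# Normal approximation to the 95% critical value (assumption, see lead-in).
z_crit = NormalDist().inv_cdf(0.975)

diff = p_w - p_b
ci_low, ci_high = diff - z_crit * se, diff + z_crit * se
print(f"difference = {diff:.4f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```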

Now that we have rejected our initial hypothesis that the callback rates are the same, we can look closer at the ratio of callbacks for white-sounding names to callbacks for black-sounding names. Dividing the two callback rates computed earlier (about 0.097 for white-sounding names and 0.064 for black-sounding names) shows that white-sounding names receive callbacks approximately 1.5 times as often as black-sounding names.
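To get a sense of how precise that ratio is, one option is the standard large-sample interval for a ratio of two proportions, computed on the log scale. This is a sketch that goes slightly beyond what the notebook computes. The variable names and the choice of a 95% level are assumptions made here.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Counts taken from the notebook outputs above.
calls_w, n_w = 235, 2435   # white-sounding names
calls_b, n_b = 157, 2435   # black-sounding names

# Ratio of callback rates (white-sounding over black-sounding).
ratio = (calls_w / n_w) / (calls_b / n_b)

# Large-sample standard error of log(ratio).
se_log = sqrt(1 / calls_w - 1 / n_w + 1 / calls_b - 1 / n_b)

z_crit = NormalDist().inv_cdf(0.975)  # 95% two-sided
ratio_low = exp(log(ratio) - z_crit * se_log)
ratio_high = exp(log(ratio) + z_crit * se_log)

print(f"ratio = {ratio:.3f}, 95% CI = ({ratio_low:.3f}, {ratio_high:.3f})")
```

If the lower bound of the interval stays above 1, the ratio is distinguishable from equal callback rates at that confidence level.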

Before we reach a race-based conclusion, let's look at some other factors that influence employment, such as education, number of jobs previously held, and years of experience. We will look at a list of all column names to see what else would be interesting to add to the comparison.

In [17]:
list(data)

['id',
 'ad',
 'education',
 'ofjobs',
 'yearsexp',
 'honors',
 'volunteer',
 'military',
 'empholes',
 'occupspecific',
 'occupbroad',
 'workinschool',
 'email',
 'computerskills',
 'specialskills',
 'firstname',
 'sex',
 'race',
 'h',
 'l',
 'call',
 'city',
 'kind',
 'adid',
 'fracblack',
 'fracwhite',
 'lmedhhinc',
 'fracdropout',
 'fraccolp',
 'linc',
 'col',
 'expminreq',
 'schoolreq',
 'eoe',
 'parent_sales',
 'parent_emp',
 'branch_sales',
 'branch_emp',
 'fed',
 'fracblack_empzip',
 'fracwhite_empzip',
 'lmedhhinc_empzip',
 'fracdropout_empzip',
 'fraccolp_empzip',
 'linc_empzip',
 'manager',
 'supervisor',
 'secretary',
 'offsupport',
 'salesrep',
 'retailsales',
 'req',
 'expreq',
 'comreq',
 'educreq',
 'compreq',
 'orgreq',
 'manuf',
 'transcom',
 'bankreal',
 'trade',
 'busservice',
 'othservice',
 'missind',
 'ownership']

In [18]:
data_call = data[data.call==1]
exp_call = data_call[['education', 'ofjobs', 'yearsexp', 'military', 'race', 'sex', 'call']]
exp_call.head()

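| | education | ofjobs | yearsexp | military | race | sex | call |
|---|---|---|---|---|---|---|---|
| 85 | 2 | 3 | 7 | 0 | w | m | 1.0 |
| 95 | 2 | 3 | 4 | 0 | w | m | 1.0 |
| 105 | 4 | 2 | 6 | 0 | w | f | 1.0 |
| 107 | 4 | 3 | 6 | 0 | b | f | 1.0 |
| 126 | 4 | 1 | 9 | 0 | b | f | 1.0 |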


In [19]:
call_w, call_b

(235.0, 157.0)

Let's also determine how many callbacks went to male and female applicants.

In [21]:
call_m = sum(data[data.sex=='m'].call)
call_f = sum(data[data.sex=='f'].call)
call_m, call_f

(83.0, 309.0)

Far more callbacks went to female applicants than to male applicants, so we'll count how many male and female applicants there were in the full dataset to put those counts in context.

In [22]:
data['sex'].value_counts()

f    3746
m    1124
Name: sex, dtype: int64

In [24]:
prop_f = call_f/3746
prop_m = call_m/1124
prop_f, prop_m, prop_f/prop_m

(0.08248798718633209, 0.07384341637010676, 1.117066236113702)

Female applicants received callbacks at a slightly higher rate than male applicants (about 8.2% versus 7.4%, a ratio of roughly 1.12). Recall prop_w and prop_b from earlier.

In [25]:
prop_w, prop_b, prop_w/prop_b

(0.09650924024640657, 0.06447638603696099, 1.4968152866242037)

In [26]:
data_no_call = data[data.call==0]
exp_no_call = data_no_call[['education', 'ofjobs', 'yearsexp', 'military', 'race', 'sex', 'call']]
exp_no_call.head()

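| | education | ofjobs | yearsexp | military | race | sex | call |
|---|---|---|---|---|---|---|---|
| 0 | 4 | 2 | 6 | 0 | w | f | 0.0 |
| 1 | 3 | 3 | 6 | 1 | w | f | 0.0 |
| 2 | 4 | 1 | 6 | 0 | b | f | 0.0 |
| 3 | 3 | 4 | 6 | 0 | b | f | 0.0 |
| 4 | 3 | 3 | 22 | 0 | w | f | 0.0 |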


We can see in the small dataframe above that the five shown applicants that received no calls are female. We can see in the description tables below that the race and sex variables are not included in the breakdown since they are string values.

In [27]:
exp_call.describe()

| | education | ofjobs | yearsexp | military | call |
|---|---|---|---|---|---|
| count | 392.0 | 392.0 | 392.0 | 392.0 | 392.0 |
| mean | 3.604592 | 3.670918 | 8.890306 | 0.076531 | 1.0 |
| std | 0.711095 | 1.313464 | 5.535351 | 0.266185 | 0.0 |
| min | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 |
| 25% | 3.0 | 3.0 | 5.0 | 0.0 | 1.0 |
| 50% | 4.0 | 4.0 | 7.0 | 0.0 | 1.0 |
| 75% | 4.0 | 5.0 | 11.0 | 0.0 | 1.0 |
| max | 4.0 | 7.0 | 26.0 | 1.0 | 1.0 |


In [28]:
exp_no_call.describe()

| | education | ofjobs | yearsexp | military | call |
|---|---|---|---|---|---|
| count | 4478.0 | 4478.0 | 4478.0 | 4478.0 | 4478.0 |
| mean | 3.619696 | 3.660563 | 7.751228 | 0.098928 | 0.0 |
| std | 0.715404 | 1.210672 | 4.989577 | 0.298599 | 0.0 |
| min | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 |
| 25% | 3.0 | 3.0 | 5.0 | 0.0 | 0.0 |
| 50% | 4.0 | 4.0 | 6.0 | 0.0 | 0.0 |
| 75% | 4.0 | 4.0 | 9.0 | 0.0 | 0.0 |
| max | 4.0 | 7.0 | 44.0 | 1.0 | 0.0 |


We can see what appears to be an outlier in the years of experience in exp_no_call. An applicant with 44 years of prior experience was not given a callback. With that much experience, we would expect them to be qualified for a job. So we'll check the row for that entry to see if there is anything of interest.

In [29]:
exp_no_call[exp_no_call.yearsexp==44]

| | education | ofjobs | yearsexp | military | race | sex | call |
|---|---|---|---|---|---|---|---|
| 1804 | 4 | 5 | 44 | 0 | b | f | 0.0 |


It turns out the applicant had a black-sounding name and was female, which is interesting, and we'll take it into consideration in our response. Beyond discrimination, it could also be that the applicant was overqualified for the position. We see another interesting data point in exp_call, where the maximum years of experience is 26.

In [30]:
exp_call[exp_call.yearsexp==26]

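| | education | ofjobs | yearsexp | military | race | sex | call |
|---|---|---|---|---|---|---|---|
| 576 | 3 | 5 | 26 | 0 | b | f | 1.0 |
| 1547 | 3 | 5 | 26 | 0 | w | f | 1.0 |
| 1695 | 3 | 5 | 26 | 0 | b | f | 1.0 |
| 2201 | 3 | 5 | 26 | 0 | b | f | 1.0 |
| 2221 | 3 | 5 | 26 | 0 | w | f | 1.0 |
| 2541 | 3 | 5 | 26 | 0 | b | f | 1.0 |
| 2738 | 3 | 5 | 26 | 0 | b | f | 1.0 |
| 3248 | 4 | 5 | 26 | 0 | w | m | 1.0 |
| 3491 | 4 | 5 | 26 | 0 | w | f | 1.0 |
| 4401 | 4 | 5 | 26 | 0 | b | f | 1.0 |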


It is interesting to note that most of these applicants were female, that they had an education value of 3 or 4, and that they had all previously held 5 jobs. None of them has military experience. There are 6 black-sounding names and 4 white-sounding names. I don't think race-associated names played a factor in the callbacks of these experienced applicants, since they all have highly similar educational and experiential backgrounds.

In the description tables, the race and sex columns are not included since they are string values. We will use the proportion variables to discuss those columns.

We can see that, between the call and no_call description tables, the mean education values are both about 3.6 and the mean ofjobs values are both about 3.7. Mean years of experience is about one year greater for applicants who received callbacks (8.9) than for those who did not (7.8), and the share of applicants with military experience is lower among those who received callbacks (about 7.7%) than among those who did not (about 9.9%). We can see from the proportion calculations that female applicants received callbacks at a higher rate than male applicants, and white-sounding names received callbacks at a higher rate than black-sounding names.
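One way to check whether the gap in years of experience is larger than sampling noise is an unequal-variance (Welch) test computed directly from the summary statistics in the two `describe()` outputs. This is a rough sketch rather than part of the original analysis. It assumes a normal approximation for the p-value, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# yearsexp summary statistics copied from the describe() outputs above.
mean_call, std_call, n_call = 8.890306, 5.535351, 392    # received a callback
mean_no, std_no, n_no = 7.751228, 4.989577, 4478         # no callback

diff = mean_call - mean_no

# Welch (unequal-variance) standard error of the difference in means.
se = sqrt(std_call**2 / n_call + std_no**2 / n_no)
t_stat = diff / se

# Normal approximation to the two-sided p-value (assumption, see lead-in).
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"difference = {diff:.3f} years, t = {t_stat:.3f}, p ~ {p_approx:.2e}")
```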

Considering these other factors, I would not conclude that the difference in callbacks was due to ethnic discrimination. With all other things being equal, callback rates also varied by sex. Other things to look at would be age and location, in particular the state, since some states may have higher rates of racial discrimination than others. For this reason I would not conclude that race is the 'most important factor in callback success'. There are other equally relevant factors.