# Examining Racial Discrimination in the US Job Market

## Background

Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

## Data

In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.


In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
# Load the Stata file through the public pandas reader
data = pd.read_stata('us_job_market_discrimination.dta')

In [25]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)


157.0

In [26]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0
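
The two counts above are easier to compare as rates. The following is a minimal sketch, not part of the original analysis. It assumes a stand-in frame rebuilt from the counts shown above (157 and 235 callbacks) and a group size of 2,435 resumes each, as reported by the `describe()` output in this notebook, rather than the real `.dta` file. The names `toy`, `callbacks`, and `rates` are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumption: 2,435 resumes per group, with the callback counts shown above.
n_per_group = 2435
callbacks = {'b': 157, 'w': 235}

# Rebuild a stand-in frame with the same 'race' and 'call' columns.
frames = []
for race, k in callbacks.items():
    call = np.zeros(n_per_group)
    call[:k] = 1.0  # the first k resumes in each group received a callback
    frames.append(pd.DataFrame({'race': race, 'call': call}))
toy = pd.concat(frames, ignore_index=True)

# Callbacks, group size, and callback rate per group in one step.
rates = toy.groupby('race')['call'].agg(['sum', 'count', 'mean'])
print(rates)
```

On the real data, the same `groupby` call on `data` should give the callback rate for each group directly.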

In [9]:
data[['race','call']].head()

  race  call
0    w   0.0
1    w   0.0
2    b   0.0
3    b   0.0
4    w   0.0


In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4870 entries, 0 to 4869
Data columns (total 65 columns):
id                    4870 non-null object
ad                    4870 non-null object
education             4870 non-null int8
ofjobs                4870 non-null int8
yearsexp              4870 non-null int8
honors                4870 non-null int8
volunteer             4870 non-null int8
military              4870 non-null int8
empholes              4870 non-null int8
occupspecific         4870 non-null int16
occupbroad            4870 non-null int8
workinschool          4870 non-null int8
email                 4870 non-null int8
computerskills        4870 non-null int8
specialskills         4870 non-null int8
firstname             4870 non-null object
sex                   4870 non-null object
race                  4870 non-null object
h                     4870 non-null float32
l                     4870 non-null float32
call                  4870 non-null float32
city        

# Does race have a significant impact on the rate of callbacks for resumes?

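n = 4870, which is large enough to apply the central limit theorem (CLT) to this problem. The outcome is binary, so each group's callback rate is a sample proportion, and each group has well over ten callbacks and ten non-callbacks, so the normal approximation is reasonable. The appropriate tests are the z-test and the t-test. I use the z-test rather than the t-test because the sample is large enough.

H0: Race has no impact on the callback rate for resumes.

H1: Race has an impact on the callback rate for resumes.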
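
To make the test statistic concrete, here is a minimal sketch of a two-proportion z-test computed by hand. It is an illustration under stated assumptions, not the notebook's own method: it uses the callback counts shown earlier (157 and 235), assumes 2,435 resumes per group, and pools the two proportions for the standard error, which is the usual choice under H0. All variable names are hypothetical.

```python
import math

# Assumption: counts taken from the earlier cells, 2,435 resumes per group.
n_w, n_b = 2435, 2435
k_w, k_b = 235, 157

p_w = k_w / n_w  # callback rate, white-sounding names
p_b = k_b / n_b  # callback rate, black-sounding names

# Under H0 both groups share one callback rate, so pool the counts.
p_pool = (k_w + k_b) / (n_w + n_b)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))

z = (p_w - p_b) / se_pool

# Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print('z:', z)
print('two-sided p:', p_value)
```

The pooled standard error is used only for the test. A confidence interval for the difference normally uses the unpooled standard error instead.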

In [10]:
from scipy import stats
from statsmodels.stats import weightstats as stests

In [11]:
df=data.set_index('race')

In [12]:
df_w=df.loc['w']
df_b=df.loc['b']

In [21]:
df_w['call'].describe()

count    2435.000000
mean        0.096509
std         0.295346
min         0.000000
25%         0.000000
50%         0.000000
75%         0.000000
max         1.000000
Name: call, dtype: float64

In [22]:
df_b['call'].describe()

count    2435.000000
mean        0.064476
std         0.245649
min         0.000000
25%         0.000000
50%         0.000000
75%         0.000000
max         1.000000
Name: call, dtype: float64

In [28]:
# Frequentist tests
z_value = 1.96  # critical value for a 95% confidence level
# White-sounding names: callback proportion and its standard error
w_prob = sum(df_w['call']) / len(df_w['call'])
w_std_err = np.sqrt(w_prob * (1 - w_prob) / len(df_w['call']))
# Black-sounding names: callback proportion and its standard error
b_prob = sum(df_b['call']) / len(df_b['call'])
b_std_err = np.sqrt(b_prob * (1 - b_prob) / len(df_b['call']))
# Margin of error for the difference in proportions
std_err_dif = np.sqrt(w_std_err ** 2 + b_std_err ** 2)
margin_err_diff = z_value * std_err_dif
print('Margin of Error:', margin_err_diff)
# 95% confidence interval for the difference in proportions
prob_diff = w_prob - b_prob
conf_int = (prob_diff - margin_err_diff, prob_diff + margin_err_diff)
print('confidence interval:', conf_int)
# Two-sample z-test
ztest, pval = stests.ztest(df_w['call'], x2=df_b['call'], value=0, alternative='two-sided')
print('p:', float(pval))
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

Margin of Error: 0.015255406349886438
confidence interval: (0.016777447859559147, 0.047288260559332024)
p: 3.8767429116085706e-05
reject null hypothesis


In [19]:
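# Permutation test: shuffle the race labels and recompute the difference
def permutation_sample(data1, data2):
    """Pool both samples, shuffle, and split back into the original sizes."""
    data = np.concatenate((data1, data2))
    permuted_data = np.random.permutation(data)
    perm_sample_1 = permuted_data[:len(data1)]
    perm_sample_2 = permuted_data[len(data1):]
    return perm_sample_1, perm_sample_2

def draw_perm_reps(data_1, data_2, func, size=1):
    """Compute func on `size` permuted pairs of samples."""
    perm_replicates = np.empty(size)
    for i in range(size):
        perm_sample_1, perm_sample_2 = permutation_sample(data_1, data_2)
        perm_replicates[i] = func(perm_sample_1, perm_sample_2)
    return perm_replicates

def dif_of_mean(data_1, data_2):
    return np.mean(data_1) - np.mean(data_2)

perm_replicates = draw_perm_reps(df_w['call'], df_b['call'], dif_of_mean, 10000)
emp_mean = dif_of_mean(df_w['call'], df_b['call'])
# One-sided p-value: share of permuted differences at least as large as observed
p = np.sum(perm_replicates >= emp_mean) / len(perm_replicates)
print('p: ', p)
# 2.5th and 97.5th percentiles of the permutation replicates: the range of
# differences expected under H0, not an interval around the observed difference
conf_int = np.percentile(perm_replicates, [2.5, 97.5])
print('confidence intervals:', conf_int)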

p:  0.0
confidence intervals: [-0.01560576  0.01560576]
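
The cell above shuffles the race labels, so the percentile range it prints describes the differences expected under the null hypothesis. For an interval around the observed difference, one option is a bootstrap that resamples each group with replacement. The following is a minimal sketch under stated assumptions: the two arrays are rebuilt from the counts shown earlier (235 and 157 callbacks out of 2,435 resumes each) rather than read from the file, and the helper `bootstrap_diff`, the seed, and the number of replicates are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Assumption: stand-in 0/1 callback arrays rebuilt from the reported counts.
n = 2435
calls_w = np.zeros(n)
calls_w[:235] = 1
calls_b = np.zeros(n)
calls_b[:157] = 1

def bootstrap_diff(a, b, n_reps, rng):
    """Difference in means over n_reps resamples drawn with replacement."""
    reps = np.empty(n_reps)
    for i in range(n_reps):
        resample_a = rng.choice(a, size=len(a), replace=True)
        resample_b = rng.choice(b, size=len(b), replace=True)
        reps[i] = resample_a.mean() - resample_b.mean()
    return reps

boot_reps = bootstrap_diff(calls_w, calls_b, 2000, rng)

# Percentile bootstrap 95% confidence interval for the difference in rates.
ci_low, ci_high = np.percentile(boot_reps, [2.5, 97.5])
print('bootstrap 95% CI:', (ci_low, ci_high))
```

Unlike the permutation range, this interval is centered near the observed difference, so it can be compared directly with the frequentist confidence interval computed earlier.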


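Both p-values are far below 0.05, so we reject the null hypothesis. Resumes with white-sounding names received callbacks at a rate of about 9.7%, compared with about 6.4% for black-sounding names, a difference of roughly 3.2 percentage points. The 95% confidence interval for that difference (about 0.017 to 0.047) excludes zero, and the observed difference lies well outside the range produced by the permutation test. There is therefore a statistically significant relation between race and the callback rate. However, we did not check the association between the other variables and callbacks, so we cannot conclude that race is the only reason for the difference.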