# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

In [1]:
%matplotlib inline
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
import pylab as pl
import math

In [4]:
# adjust this path to wherever the project lives on your machine
os.chdir('C:\\LUML\\Springboard\\EDA\\racial_disc')
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [7]:
data.describe()



Unnamed: 0,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,occupbroad,workinschool,...,educreq,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind
count,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,...,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0,4870.0
mean,3.61848,3.661396,7.842916,0.052772,0.411499,0.097125,0.448049,215.637782,3.48152,0.559548,...,0.106776,0.437166,0.07269,0.082957,0.03039,0.08501,0.213963,0.267762,0.154825,0.165092
std,0.714997,1.219126,5.044612,0.223601,0.492156,0.296159,0.497345,148.127551,2.038036,0.496492,...,0.308866,0.496083,0.259649,0.275854,0.171677,0.278932,0.410141,0.442847,0.361773,0.371308
min,0.0,1.0,1.0,0.0,0.0,0.0,0.0,7.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,3.0,3.0,5.0,0.0,0.0,0.0,0.0,27.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,4.0,4.0,6.0,0.0,0.0,0.0,0.0,267.0,4.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,4.0,4.0,9.0,0.0,1.0,0.0,1.0,313.0,6.0,1.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
max,4.0,7.0,44.0,1.0,1.0,1.0,1.0,903.0,6.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [10]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [11]:
len(data)

4870

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [12]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [23]:
black_resumes = data[data.race == 'b'].call.to_frame()
white_resumes = data[data.race == 'w'].call.to_frame()

In [24]:
black_resumes.describe()

Unnamed: 0,call
count,2435.0
mean,0.064476
std,0.245649
min,0.0
25%,0.0
50%,0.0
75%,0.0
max,1.0


In [25]:
white_resumes.describe()

Unnamed: 0,call
count,2435.0
mean,0.096509
std,0.295346
min,0.0
25%,0.0
50%,0.0
75%,0.0
max,1.0


# What test is appropriate for this problem? Does CLT apply?

In [26]:
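# Both variables are categorical: race ('b' or 'w') and call (1 or 0).
# Crossing them gives four counts:
#   black resumes - got a call
#   black resumes - did not get a call
#   white resumes - got a call
#   white resumes - did not get a call
# so a chi-square test of independence on this 2x2 table is appropriate.
# The CLT also applies: each group has 2435 resumes, with far more than 10 callbacks
# and 10 non-callbacks, so a two-proportion z-test is valid as well.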
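
The sample-size condition behind the CLT question can be checked directly. The sketch below uses the counts reported in the outputs above (157 and 235 callbacks out of 2435 resumes per group) and a common rule of thumb of at least 10 successes and 10 failures per group; the helper name `clt_conditions_met` and the threshold are illustrative assumptions, not part of the original notebook.

```python
# Counts taken from the notebook outputs above.
groups = {
    "black-sounding": {"n": 2435, "callbacks": 157},
    "white-sounding": {"n": 2435, "callbacks": 235},
}


def clt_conditions_met(n, successes, threshold=10):
    """Rule-of-thumb check: at least `threshold` successes and failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


for name, group in groups.items():
    failures = group["n"] - group["callbacks"]
    ok = clt_conditions_met(group["n"], group["callbacks"])
    print(f"{name}: callbacks={group['callbacks']}, no callback={failures}, CLT ok={ok}")
```

Both groups pass comfortably, which is why a normal approximation to the difference in callback rates is reasonable here.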

# What are the null and alternate hypotheses?

In [45]:
# Null hypothesis - race has no effect on the number of calls
# Alternate hypothesis - race has an effect on the number of calls
b_called = len(black_resumes[black_resumes['call'] == 1.0])
b_not_called = len(black_resumes[black_resumes['call'] == 0.0])
w_called = len(white_resumes[white_resumes['call'] == 1.0])
w_not_called = len(white_resumes[white_resumes['call'] == 0.0])

In [58]:
# find out observed  values
observed = pd.DataFrame({'black-resumes': {'called': b_called, 'not-called': b_not_called},
                         'white-resumes': {'called' : w_called, 'not-called' : w_not_called}})

In [59]:
observed

Unnamed: 0,black-resumes,white-resumes
called,157,235
not-called,2278,2200


In [60]:
# number of black resumes - 2435
# number of white resumes - 2435
# so the resumes are split evenly across the two races.
# From this data, we can see that white resumes received more calls than black resumes (235 vs 157).
# We need to find out whether the difference in the number of calls is statistically significant
# or just due to random chance.
# We will use a chi-square test to find out.
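
As a rough illustration of what the chi-square test of independence computes, the sketch below derives each expected count from the marginal totals (row total × column total / grand total) and sums the squared deviations. It uses only the standard library; `chi_square_2x2` is a hypothetical helper written for this example, and it applies no continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    `table` is [[a, b], [c, d]]. Returns (statistic, p_value) using
    1 degree of freedom and no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: called / not called; columns: black-sounding / white-sounding names.
observed_table = [[157, 235], [2278, 2200]]
chi2_stat, chi2_p = chi_square_2x2(observed_table)
print(f"chi-square = {chi2_stat:.3f}, p-value = {chi2_p:.2e}")
```

Because both groups contain 2435 resumes, the expected number of callbacks under independence is the same for each group: half of the 392 total callbacks.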

In [68]:
# find out expected values
total_called = b_called + w_called
total_not_called = b_not_called + w_not_called
print(total_called)
print(total_not_called)

392
4478


In [69]:
# overall callback rate = callbacks / all resumes (not callbacks / non-callbacks)
call_back_rate = total_called / (total_called + total_not_called)
print(round(call_back_rate, 4))

0.0805


In [70]:
# expected counts per group if race had no effect (each group has len(data) / 2 resumes)
e_called = len(data) / 2 * call_back_rate
e_not_called = len(data) / 2 * (1 - call_back_rate)
print(round(e_called, 2))
print(round(e_not_called, 2))

196.0
2239.0


In [73]:
# stats.chisquare takes the observed and expected frequencies as arrays
observed_values = [b_not_called, w_not_called, w_called, b_called]
expected_values = [e_not_called, e_not_called, e_called, e_called]

dof = 1 # (no of columns - 1) * (no of rows - 1) in observed table
# stats.chisquare uses k - 1 - ddof degrees of freedom, where k is the no of observed frequencies
# so ddof = k - 1 - dof = 4 - 1 - 1 = 2

chi2, p_value = stats.chisquare(f_obs = observed_values, f_exp = expected_values, ddof=2)
print(f"statistic={chi2:.3f}, pvalue={p_value:.2e}")

statistic=16.879, pvalue=3.98e-05

In [74]:
# the p-value is very small (3.98e-05), so we reject the null hypothesis. It means race plays a significant
# role in the number of callbacks

# Find out margin of error, confidence Interval and p-value

In [75]:
# Another way to solve the problem
# Null hypothesis: Both groups are called back at the same rate.
# Alternative hypothesis: The two groups are called back at different rates.
# two-tailed test of proportions

l1 = len(white_resumes)
l2 = len(black_resumes)
p1 = w_called / l1
p2 = b_called / l2

print(p1)
print(p2)

p = (p1 * l1 + p2 * l2) / (l1 + l2)
stderr = np.sqrt(p * (1 - p) * ((1 / l1) + (1 / l2)))
stderr

z = (p1 - p2) / stderr
z

0.09650924024640657
0.06447638603696099


4.1084121524343464

In [76]:
p_values = stats.norm.sf(abs(z))*2 #twosided
p_values

3.9838868375850767e-05

In [77]:
# The p-value is less than 0.05, so we can reject the null hypothesis.
# There is evidence that the callback rate differs by race: white resumes are called back at a higher rate than black resumes
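
For reuse, the pooled two-proportion z-test can be wrapped in a small function. This is a minimal sketch using only the standard library: `two_proportion_ztest` is a hypothetical helper name, and `math.erfc` stands in for `stats.norm.sf` when computing the two-sided p-value.

```python
import math


def two_proportion_ztest(successes_1, n_1, successes_2, n_2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Under the null hypothesis both groups share one callback rate.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z_stat = (p_1 - p_2) / std_err
    # Two-sided tail area of the standard normal: 2 * sf(|z|) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z_stat) / math.sqrt(2))
    return z_stat, p_two_sided


# White-sounding names first, then black-sounding names.
z_stat, p_two_sided = two_proportion_ztest(235, 2435, 157, 2435)
print(f"z = {z_stat:.4f}, two-sided p = {p_two_sided:.2e}")
print(f"z squared = {z_stat ** 2:.3f}")
```

For a 2x2 table, the square of this z statistic equals the Pearson chi-square statistic when the expected counts come from the pooled callback rate and no continuity correction is applied, so the two tests should agree on the two-sided p-value.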

In [79]:
#confidence interval

diff = p1 - p2
# unpooled standard error: both variance terms belong inside the square root
stderr = np.sqrt(p1 * (1 - p1) / l1 + p2 * (1 - p2) / l2)
z_crit = 1.96
margin_of_error = z_crit * stderr
confidence_interval = [diff - margin_of_error, diff + margin_of_error]
print(f"margin of error: {margin_of_error:.4f}")
print(f"95% CI: [{confidence_interval[0]:.4f}, {confidence_interval[1]:.4f}]")

margin of error: 0.0153
95% CI: [0.0168, 0.0473]

In [None]:
# 95% confidence interval
# We are 95% confident that the difference in the proportion of callbacks between whites and blacks is between 0.0168 and 0.0473.
# Since the confidence interval does not contain 0, we can conclude that there is a difference in the
# callback rate for blacks and whites
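
The multiplier 1.96 is the 97.5th percentile of the standard normal distribution. The sketch below derives it with `statistics.NormalDist` and computes the interval for the difference in callback rates at several confidence levels, using the unpooled standard error sqrt(p1(1-p1)/n1 + p2(1-p2)/n2). The function name `diff_proportion_ci` is an assumption made for this example.

```python
import math
from statistics import NormalDist


def diff_proportion_ci(successes_1, n_1, successes_2, n_2, confidence=0.95):
    """Confidence interval for p_1 - p_2; returns (lower, upper, margin)."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Critical value: e.g. the 97.5th percentile for a 95% interval.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: both variance terms sit inside the square root.
    std_err = math.sqrt(p_1 * (1 - p_1) / n_1 + p_2 * (1 - p_2) / n_2)
    margin = z_crit * std_err
    diff = p_1 - p_2
    return diff - margin, diff + margin, margin


# White-sounding names first, then black-sounding names.
for level in (0.90, 0.95, 0.99):
    lower, upper, margin = diff_proportion_ci(235, 2435, 157, 2435, confidence=level)
    print(f"{level:.0%} CI: [{lower:.4f}, {upper:.4f}] (margin of error {margin:.4f})")
```

A higher confidence level widens the interval; if even the widest interval excludes 0, the conclusion does not depend on the choice of level.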