# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

<div class="span5 alert alert-info">
### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution
</div>
****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [6]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

In [17]:
w = data[data.race=='w'][['race','call']]
w_call=w[w['call']==1]
b = data[data.race=='b'][['race','call']]
b_call=b[b['call']==1]
p_white=len(w_call)/len(w)
p_black=len(b_call)/len(b)
print("The probability of getting a call back among "+str(len(w))+" white people is "+str(p_white))
print("The probability of getting a call back among "+str(len(b))+" black people is "+str(p_black))

The probability of getting a call back among 2435 white people is 0.09650924024640657
The probability of getting a call back among 2435 black people is 0.06447638603696099


#### Q1. 
A chi-square test is appropriate for examining whether a categorical outcome (receiving an interview call) differs significantly across categories (the two race groups in this case). A two-sample z-test for proportions can also be used. The CLT applies because the race labels were randomly assigned to the resumes, the observations are treated as independent, and the counts are adequate: np > 10 and n(1-p) > 10 in both groups.
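
As a rough check of both claims, the sketch below verifies the success-failure condition and computes the chi-square statistic by hand. It assumes the counts shown in the outputs above (2435 resumes per group, 235 callbacks for white-sounding names, and 157 for black-sounding names, the latter implied by the printed proportion). It uses only the standard library and applies no continuity correction, so a library routine with a correction enabled would give a slightly different value.

```python
import math

# Counts taken from the notebook output above: 2435 resumes per group,
# 235 callbacks for white-sounding names and 157 for black-sounding names.
n_w, n_b = 2435, 2435
calls_w, calls_b = 235, 157

# Success-failure check for the normal approximation:
# every cell (callbacks and non-callbacks) should exceed 10.
cells = [calls_w, n_w - calls_w, calls_b, n_b - calls_b]
clt_ok = all(count > 10 for count in cells)

# Chi-square test of independence on the 2x2 table (no continuity correction).
total_calls = calls_w + calls_b
total_n = n_w + n_b
chi2 = 0.0
for n, calls in ((n_w, calls_w), (n_b, calls_b)):
    expected_calls = n * total_calls / total_n
    expected_no_calls = n - expected_calls
    chi2 += (calls - expected_calls) ** 2 / expected_calls
    chi2 += ((n - calls) - expected_no_calls) ** 2 / expected_no_calls

# With 1 degree of freedom, P(chi2_1 > x) equals P(|Z| > sqrt(x)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(clt_ok, round(chi2, 2), p_value)
```

For a 2x2 table, this statistic is the square of the pooled two-proportion z statistic, so the two tests named in the answer give the same two-sided p-value.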
#### Q2. 
Null Hypothesis: the probability of getting a job interview call is the same for a white person and for a black person;


Alternative Hypothesis: the probability of getting a job interview call is different for a white person and for a black person.

#### Q3.
First try the frequentist approach:

For a 95% confidence interval on the difference between the two sample proportions, the critical z value is 1.96.
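
Written out, the interval takes the following form. The symbols are introduced here for illustration: $\hat{p}_w$ and $\hat{p}_b$ stand for `p_white` and `p_black`, and $n_w$ and $n_b$ for `len(w)` and `len(b)`. The standard error is unpooled, meaning each group contributes its own variance term.

```latex
\[
(\hat{p}_w - \hat{p}_b) \;\pm\; z^{*}
\sqrt{\frac{\hat{p}_w(1-\hat{p}_w)}{n_w} + \frac{\hat{p}_b(1-\hat{p}_b)}{n_b}},
\qquad z^{*} = 1.96
\]
```

The term after the $\pm$ sign is the margin of error.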

In [22]:
# standard error of the difference, using each group's own (unpooled) proportion
SE=np.sqrt(p_white*(1-p_white)/len(w)+p_black*(1-p_black)/len(b))
# margin of error at the 95% level
margin=1.96*SE
CI=(round(p_white-p_black-margin,4),round(p_white-p_black+margin,4))
print("The confidence interval of the difference of call-back proportions among white and black persons is "+str(CI))

The confidence interval of the difference of call-back proportions among white and black persons is (0.0168, 0.0473)


In [32]:
from statsmodels.stats.proportion import proportions_ztest
count=np.array([len(w_call),len(b_call)])
obs=np.array([len(w), len(b)])
stat, pval=proportions_ztest(count, obs)
print(pval)

3.983886837585077e-05


Since 0 is not within the 95% CI, we can be 95% confident that the true difference in call-back proportions between white and black persons is not zero. The p-value of the hypothesis test is also far below 0.05 (about 4e-05), so we can reject the null hypothesis that the call-back proportion is the same for the two race groups.
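
To make the z-test less of a black box, here is a minimal sketch of the same calculation using only the standard library. It assumes the counts from the outputs above (235 and 157 callbacks out of 2435 resumes each) and a pooled standard error under the null hypothesis, which is what `proportions_ztest` does with its default arguments.

```python
import math

# Counts taken from the notebook output above.
n_w, n_b = 2435, 2435
calls_w, calls_b = 235, 157

p_w = calls_w / n_w
p_b = calls_b / n_b

# Under the null hypothesis both groups share one callback rate,
# so the standard error uses the pooled proportion.
p_pool = (calls_w + calls_b) / (n_w + n_b)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_w + 1 / n_b))

z = (p_w - p_b) / se_pool
# Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 3), p_value)
```

The printed p-value should agree with the one reported by `proportions_ztest` above. Note that the test pools the two proportions, while the confidence interval does not, so the two standard errors differ slightly.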

Now use a bootstrap approach:

In [49]:
def bootstrap_replicate_1d(data, func):
    return func(np.random.choice(data, size=len(data)))

def draw_bs_reps(data, func, size=1):
    """Draw bootstrap replicates."""
    bs_replicates = np.empty(size)
    for i in range(size):
        bs_replicates[i] = bootstrap_replicate_1d(data,func)
    return bs_replicates

bs_replicates_w = draw_bs_reps(np.array(w['call']),np.mean,10000)
bs_replicates_b = draw_bs_reps(np.array(b['call']),np.mean,10000)
bs_diff=bs_replicates_w-bs_replicates_b
std=np.std(bs_diff)
margin=1.96*std
CI=(round(np.mean(bs_diff)-margin,4),round(np.mean(bs_diff)+margin,4))
print("The confidence interval is "+str(CI))

The confidence interval is (0.0168, 0.0473)
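
The interval above combines the bootstrap standard deviation with the normal critical value 1.96. A common alternative is the percentile interval, which reads the bounds directly off the sorted replicates. The sketch below is one way to do that, not the notebook's method. It rebuilds the 0/1 callback data from the counts reported earlier, uses only the standard library, and draws 2000 replicates instead of 10000 to keep pure Python fast; the seed and variable names are assumptions.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Rebuild the 0/1 callback data from the counts reported above.
calls_w = [1] * 235 + [0] * (2435 - 235)
calls_b = [1] * 157 + [0] * (2435 - 157)

def bootstrap_mean(sample):
    """Mean of one resample drawn with replacement, same size as the sample."""
    resample = random.choices(sample, k=len(sample))
    return sum(resample) / len(resample)

n_reps = 2000
diffs = sorted(bootstrap_mean(calls_w) - bootstrap_mean(calls_b)
               for _ in range(n_reps))

# Percentile interval: approximately the 2.5th and 97.5th percentiles.
lower = diffs[round(0.025 * n_reps)]
upper = diffs[round(0.975 * n_reps) - 1]
print(round(lower, 4), round(upper, 4))
```

With samples this large the percentile bounds should land close to the normal-approximation interval, though the exact digits depend on the seed.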


In [53]:
def diff_of_means(data_1, data_2):
    """Difference in means of two arrays."""
    diff = np.mean(data_1)-np.mean(data_2)
    return diff

empirical_diff_means =  p_white-p_black
mean_call=np.mean(data['call'])

w_shifted = w['call'] - np.mean(w['call']) + mean_call
b_shifted = b['call'] - np.mean(b['call']) + mean_call

bs_replicates_w = draw_bs_reps(w_shifted, np.mean, size=10000)
bs_replicates_b = draw_bs_reps(b_shifted, np.mean, size=10000)
bs_replicates = bs_replicates_w - bs_replicates_b

# one-sided p-value: share of null replicates at least as large as the observed difference
p = np.sum(bs_replicates >= empirical_diff_means) / len(bs_replicates)
print('p-value =', p)

p-value = 0.0
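
A p-value of exactly 0.0 only means that none of the 10000 replicates reached the observed difference; the true value is small but not zero. One common convention, sketched below as a suggestion, adds 1 to both the count and the number of replicates so the estimate can never be exactly zero. The helper name `bootstrap_p_value` and the toy numbers are hypothetical, and the two-sided option assumes the null replicates are centered on zero.

```python
def bootstrap_p_value(replicates, observed, two_sided=True):
    """Share of null replicates at least as extreme as the observed value.

    Adds 1 to numerator and denominator so the estimate is never exactly 0.
    """
    if two_sided:
        extreme = sum(abs(r) >= abs(observed) for r in replicates)
    else:
        extreme = sum(r >= observed for r in replicates)
    return (extreme + 1) / (len(replicates) + 1)

# Toy replicates (hypothetical values, not the notebook's data).
toy_replicates = [-0.02, -0.01, 0.0, 0.01, 0.02]
p_none = bootstrap_p_value(toy_replicates, 0.032)  # no replicate is as extreme
p_some = bootstrap_p_value(toy_replicates, 0.015)  # two replicates are as extreme
print(p_none, p_some)
```

Under this convention, 10000 replicates with no exceedances would be reported as 1/10001, roughly 0.0001, which is still far below 0.05.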


<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

#### Q4
Given that the 95% confidence interval for the difference in call-back proportions between the white and black groups does not cover zero, and that the p-values from both the z-test and the bootstrap test are less than 0.05, we can conclude that the difference in call-back proportions between the two groups is statistically significant.
#### Q5
My result doesn't mean that race/name is the most important factor in callback success. The analysis only shows that, without considering other factors, resumes with white-sounding and black-sounding names differ in their call-back probabilities. Other factors, such as gender, years of experience, and education, may also influence call-back rates. None of these were examined, so we cannot say that they are less important than race/name. To amend the analysis, these variables could be included alongside race, for example in a logistic regression on 'call', so that their effects can be compared.