# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [4]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [5]:
data.head()

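| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |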


<div class="span5 alert alert-success">
<p> For this problem, the CLT applies: résumés were randomly assigned to the two name groups, each group is large (2,435 résumés), and each group contains well over 10 callbacks and 10 non-callbacks, so the sampling distribution of the callback proportion is approximately normal. <br>
First, I will run a chi-squared test of independence between the two categorical variables, race and call. <br>
$H_{0}$: Receiving a call is independent of race <br>
$H_{1}$: Receiving a call is not independent of race</p>
</div>
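A common rule of thumb for the normal approximation to a proportion is that each group should contain at least 10 successes and 10 failures. The sketch below checks this using the callback counts reported in this notebook (235 and 157 callbacks, with 2,435 résumés per group). The threshold of 10 and the helper name `clt_condition_met` are illustrative assumptions, not part of the assignment.

```python
def clt_condition_met(successes, n, threshold=10):
    """Return True if both the success and failure counts reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


# (callbacks, group size) taken from the counts shown in this notebook
groups = {
    "white_sounding": (235, 2435),
    "black_sounding": (157, 2435),
}

for name, (calls, n) in groups.items():
    print(name, "meets the success/failure condition:", clt_condition_met(calls, n))
```

Both groups pass comfortably, which supports using normal-theory methods for the proportions.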



In [6]:
# Create a contingency table
table = pd.crosstab(data.call, data.race, margins=True)
table.columns = ['black_sounding', 'white_sounding', 'total']
table.index = ['no_call', 'call', 'total']
table

| | black_sounding | white_sounding | total |
|---|---|---|---|
| no_call | 2278 | 2200 | 4478 |
| call | 157 | 235 | 392 |
| total | 2435 | 2435 | 4870 |


In [7]:
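# Chi-squared test of independence on the 2x2 counts only.
# The 'total' row and column are excluded: including the margins inflates
# the degrees of freedom (4 instead of 1) and overstates the p-value.
# correction=False disables Yates' continuity correction.
counts = table.iloc[:2, :2]
chi2_stat, chi2_p, dof, expected = stats.chi2_contingency(counts, correction=False)
print('Chi-squared statistic is {:.3f}, p-value is {:.2e}'.format(chi2_stat, chi2_p))

Chi-squared statistic is 16.879, p-value is 3.98e-05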


Based on the chi-squared test, we reject the null hypothesis and conclude that race and callbacks are associated.
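The test statistic can also be reproduced by hand from the four observed counts in the contingency table, which makes the expected counts and the degrees of freedom explicit. The following is a minimal sketch in plain Python. The variable names are illustrative, no continuity correction is applied, and the closed-form p-value used here holds only for one degree of freedom.

```python
import math

# Observed counts without the 'total' margins.
# Rows: no_call, call. Columns: black_sounding, white_sounding.
observed = [[2278, 2200],
            [157, 235]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom for an r x c table: (r - 1) * (c - 1)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# For exactly 1 degree of freedom, P(X >= x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

print("chi2 = {:.3f}, dof = {}, p-value = {:.2e}".format(chi2, dof, p_value))
```

A 2×2 table has a single degree of freedom, so the margin totals must not be passed to the test as if they were data.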

<div class="span5 alert alert-success">
<p> Second, I will test the difference between the two population proportions using a two-proportion z-test. </p>
</div>

We'll test: <br>
$H_{0}: p_{1} = p_{2}$ <br>
$H_{1}: p_{1} \neq p_{2}$ <br>
where $p_{1}$ is the proportion of résumés with white-sounding names that received callbacks and $p_{2}$ is the proportion of résumés with black-sounding names that received callbacks. <br>
The test statistic for the difference in two population proportions is <br>
\begin{equation}Z = \frac{\hat{p}_{1} - \hat{p}_{2}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)}}
\end{equation}
where $\hat{p}$ is the proportion of "successes" in the two samples combined.

In [8]:
w = data[data.race=='w']
b = data[data.race=='b']

In [9]:
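alpha = 0.05

# Sample sizes
n1 = len(w)
n2 = len(b)

# Sample callback proportions for each group
p1_hat = np.sum(w.call == 1) / n1
p2_hat = np.sum(b.call == 1) / n2

# Pooled proportion under H0 (p1 == p2)
p_hat = np.sum(data.call == 1) / len(data)

# Standard error of the difference, using the pooled proportion
se = np.sqrt(p_hat * (1 - p_hat) * (1/n1 + 1/n2))

# Critical z value for a two-sided test at level alpha
z_score = stats.norm.ppf(1 - alpha/2)

# Margin of error and confidence interval (based on the pooled standard error)
margin_err = z_score * se
con_int = [(p1_hat - p2_hat) - margin_err, (p1_hat - p2_hat) + margin_err]

# Test statistic and two-sided p-value
z_stat = (p1_hat - p2_hat) / se
p_val = 2 * (1 - stats.norm.cdf(abs(z_stat)))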

print('margin of error is {}'.format(margin_err))
print('confidence interval is {}'.format(con_int))
print('z-statistic is {}'.format(z_stat))
print('p-value is {}'.format(p_val))

margin of error is 0.015281631502169232
confidence interval is [0.016751222707276352, 0.047314485711614819]
z-statistic is 4.108412152434346
p-value is 3.983886837577444e-05


The p-value is so small that we reject the null hypothesis: the difference in callback rates between white-sounding and black-sounding names is statistically significant.
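The pooled standard error is the natural choice for the hypothesis test, because it assumes $p_{1} = p_{2}$. For a confidence interval on the difference, an unpooled standard error is the more common choice, since the interval does not assume the null hypothesis. The sketch below computes that variant from the callback counts in this notebook. The function name `diff_proportion_ci` is a hypothetical helper, and the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # Each group contributes its own variance term (no pooling)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value, about 1.96 for 95% confidence
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * se
    diff = p1 - p2
    return diff - margin, diff + margin, margin


# 235 of 2435 white-sounding and 157 of 2435 black-sounding résumés got callbacks
low, high, margin = diff_proportion_ci(235, 2435, 157, 2435)
print("margin of error: {:.4f}".format(margin))
print("95% CI for p1 - p2: [{:.4f}, {:.4f}]".format(low, high))
```

With groups this large the pooled and unpooled intervals are nearly identical, and both exclude zero.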

<div class="span5 alert alert-success">
<p> Last, I will use a permutation test to check whether the observed difference in proportions could have arisen by chance, that is, under the null hypothesis that receiving a callback is independent of race. To simulate this null hypothesis, I pool the callback outcomes of both groups, shuffle them, and split the shuffled outcomes into two groups of the original sizes. For each permutation, I compute the difference in proportions and then count how many permutation replicates show a difference at least as extreme as the observed one. </p>
</div>

In [10]:
np.random.seed(42)

# Observed difference in proportion for the two groups
diff_ob = p1_hat - p2_hat

# Permutation sampling
w_b_data = np.concatenate((w.call, b.call))
perm_reps = np.empty(10000)
for i in range(10000):
    permuted_data = np.random.permutation(w_b_data)
    
    perm_w = permuted_data[:len(w)]
    perm_b = permuted_data[len(w):]
    
    perm_w_frac = np.sum(perm_w)/len(perm_w)
    perm_b_frac = np.sum(perm_b)/len(perm_b)
    perm_reps[i] = perm_w_frac - perm_b_frac

p = np.sum(abs(perm_reps) >= abs(diff_ob))/len(perm_reps)
print('p-value = ', p)
    

p-value =  0.0001


The permutation test gives a p-value of about 0.0001, so we reject the null hypothesis that receiving a callback is independent of race.
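The permutation test yields a p-value but no interval estimate. A bootstrap confidence interval for the difference in callback rates can be sketched as follows, using the counts from this notebook rather than the raw data frame. The seed, the number of replicates, and the variable names are assumptions chosen for illustration. Resampling $n$ binary outcomes with replacement from a sample containing $k$ callbacks gives a Binomial($n$, $k/n$) count, so the sketch draws those counts directly instead of resampling arrays.

```python
import numpy as np

rng = np.random.default_rng(42)

# Group sizes and callback counts from the contingency table
n_w, calls_w = 2435, 235   # white-sounding names
n_b, calls_b = 2435, 157   # black-sounding names
n_reps = 10_000

# Bootstrap replicates of each group's callback proportion
boot_w = rng.binomial(n_w, calls_w / n_w, size=n_reps) / n_w
boot_b = rng.binomial(n_b, calls_b / n_b, size=n_reps) / n_b
boot_diff = boot_w - boot_b

# Percentile bootstrap 95% confidence interval and its half-width
ci_low, ci_high = np.percentile(boot_diff, [2.5, 97.5])
boot_margin = (ci_high - ci_low) / 2

print("bootstrap 95% CI: [{:.4f}, {:.4f}]".format(ci_low, ci_high))
print("approximate margin of error: {:.4f}".format(boot_margin))
```

If the interval excludes zero, it agrees with the frequentist result that the callback rates differ.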

<div class="span5 alert alert-success">
<p> In this study, researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers. All three tests lead to the same conclusion: race/name has a statistically significant effect on callbacks for interviews. However, these three tests alone do not show that race/name is the most important factor in callback success, because they do not control for other relevant variables, which may lead us to over- or underestimate the impact of race/name on callbacks. For a more complete analysis, we should include other relevant factors such as education, experience, and skills. </p>
</div>