# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.


### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution



In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# callback rate for black-sounding names (printed first), then for white-sounding names
print(sum(data[data.race=='b'].call)/len(data[data.race=='b']))
sum(data[data.race=='w'].call)/len(data[data.race=='w'])

0.06447638603696099


0.09650924024640657

In [4]:
data.head()

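|   | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|----|----|-----------|--------|----------|--------|-----------|----------|----------|---------------|-----|---------|--------|-------|----------|----------|-------|------------|------------|---------|-----------|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |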


<div class="span5 alert alert-success">
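<p>1. What test is appropriate for this problem? Does CLT apply?</p>

    This is a Bernoulli problem because each résumé has a binary outcome: callback or no callback. We will use a two-sample t-test to compare the callback rates of the two groups; with samples this large it gives essentially the same result as a two-proportion z-test.

    The CLT applies. Names were assigned to résumés at random, so the observations can be treated as independent, and each group is far larger than the usual n > 30 rule of thumb, so the sampling distribution of the difference in callback rates is approximately normal.

<p>2. What are the null and alternate hypotheses?</p>
    H0: The callback rate is the same for white-sounding and black-sounding names.
    Ha: The callback rate differs between white-sounding and black-sounding names.
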
</div>
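
The large-sample condition behind the CLT answer can be checked with the usual success/failure rule of thumb (at least 10 callbacks and 10 non-callbacks per group). The sketch below is illustrative only: the group size of 2,435 and the callback counts of 157 and 235 are assumptions chosen because they reproduce the printed callback rates, and `clt_conditions_met` is a hypothetical helper, not part of the notebook.

```python
# Assumed counts (not stated in the notebook): 2,435 resumes per group,
# with 157 and 235 callbacks, which reproduce the printed rates 0.0645 and 0.0965.
N_PER_GROUP = 2435
CALLBACKS = {"b": 157, "w": 235}


def clt_conditions_met(successes, n, threshold=10):
    """Return True if both the success and failure counts reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


for race, successes in CALLBACKS.items():
    rate = successes / N_PER_GROUP
    ok = clt_conditions_met(successes, N_PER_GROUP)
    print(f"race={race}: rate={rate:.4f}, successes={successes}, "
          f"failures={N_PER_GROUP - successes}, normal approximation OK: {ok}")
```

In the notebook itself the same counts could be taken from `data` rather than hard-coded.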

In [5]:
total_w = len(data[data.race=='w'])
total_b = len(data[data.race=='b'])

#Percentage Callback for each Race
b_callbackrate = sum(data[data.race=='b'].call)/len(data[data.race=='b'])
w_callbackrate = sum(data[data.race=='w'].call)/len(data[data.race=='w'])

print("Black Callback Rate: ", b_callbackrate)
print("White Callback Rate: ", w_callbackrate)


Black Callback Rate:  0.06447638603696099
White Callback Rate:  0.09650924024640657


# Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

In [6]:
# Generate variables for the statistical analysis:
# difference in rates, variance of each sample proportion, and standard error of the difference

difference = w_callbackrate - b_callbackrate
b_var = (b_callbackrate*(1-b_callbackrate))/total_b
w_var = (w_callbackrate*(1-w_callbackrate))/total_w
bw_var = b_var + w_var
bw_stderr = bw_var**.5
bw_stderr

0.0077833705866767544
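
For reference, the quantity computed in this cell is the unpooled standard error of a difference between two independent sample proportions. The symbols below are notation introduced here for illustration: $\hat{p}_w$ and $\hat{p}_b$ stand for `w_callbackrate` and `b_callbackrate`, and $n_w$ and $n_b$ for `total_w` and `total_b`.

$$
\mathrm{SE}(\hat{p}_w - \hat{p}_b)
= \sqrt{\frac{\hat{p}_w\,(1-\hat{p}_w)}{n_w} + \frac{\hat{p}_b\,(1-\hat{p}_b)}{n_b}}
\approx 0.0078
$$

Each term under the square root is the variance of one sample proportion (`w_var` and `b_var`); the variances add because the two groups are independent.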

In [7]:
difference

0.032032854209445585

In [8]:
b_var**.5

0.004977121442811946

In [9]:
# Q3. Compute margin of error, confidence interval, and p-value.

In [15]:
#Margin of Error using 95% confidence interval

moe = 1.96*bw_stderr
confidenceint = difference + np.array([-1,1]) *moe
print("Confidence Interval is from ",round(confidenceint[0],3), "to ",round(confidenceint[1],3))

Confidence Interval is from  0.017 to  0.047
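
The exercise also asks for a bootstrapping approach. A minimal sketch of a percentile bootstrap interval for the same difference follows. It assumes 2,435 résumés per group with 157 and 235 callbacks (counts that reproduce the printed rates but are not stated in the notebook), and it uses the fact that resampling 0/1 outcomes with replacement is equivalent to drawing each outcome as a Bernoulli trial with the observed rate. The function names are hypothetical.

```python
import random

# Assumed counts (not stated in the notebook): 2,435 resumes per group,
# with 157 and 235 callbacks, which reproduce the printed callback rates.
N_B, CALLS_B = 2435, 157
N_W, CALLS_W = 2435, 235


def resampled_rate(n, rate, rng):
    """Callback rate of one bootstrap resample of n Bernoulli(rate) outcomes."""
    return sum(rng.random() < rate for _ in range(n)) / n


def bootstrap_diff_ci(reps=2000, alpha=0.05, seed=42):
    """Percentile bootstrap interval for the white-minus-black rate difference."""
    rng = random.Random(seed)
    p_b, p_w = CALLS_B / N_B, CALLS_W / N_W
    diffs = sorted(
        resampled_rate(N_W, p_w, rng) - resampled_rate(N_B, p_b, rng)
        for _ in range(reps)
    )
    lower = diffs[int(reps * alpha / 2)]
    upper = diffs[int(reps * (1 - alpha / 2)) - 1]
    return lower, upper


lower, upper = bootstrap_diff_ci()
print(f"Bootstrap 95% CI: {lower:.3f} to {upper:.3f}")
```

If the normal approximation is reasonable, this interval should land close to the frequentist one; the exact endpoints depend on the seed and the number of replicates.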


In [12]:
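# Degrees of freedom (Welch-Satterthwaite style approximation)
# Note: b_var and w_var are already p*(1-p)/n, so the extra division by n below
# only cancels out when both groups have the same size, and the textbook formula
# divides by (n - 1) rather than n. With samples this large the effect on the
# p-value is negligible.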

B1=b_var/total_b
W1=w_var/total_w

degfree=((B1+W1)**2)/(((B1**2)/total_b)+((W1**2)/total_w))
degfree


4713.53819343226

In [16]:
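# Test statistic: observed difference divided by its standard error (H0: difference = 0)
t_value = (difference - 0)/bw_stderr
# Two-sided p-value from the t distribution
p_value = stats.t.sf(np.abs(t_value), degfree)*2

if p_value < .05:
    print("Reject H0 and accept Ha ")
    print("P Value: ", p_value)
else:
    print("Fail to reject H0")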

Reject H0 and accept Ha 
P Value:  3.9285451158654165e-05
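
As a resampling cross-check on this p-value, a permutation test asks how often a difference at least as large as the observed one appears when the callbacks are reassigned to résumés at random, which mirrors the random assignment of names in the experiment. This is a sketch under assumptions: the counts (2,435 résumés per group, 157 and 235 callbacks) reproduce the printed rates but are not stated in the notebook, and `permutation_p_value` is a hypothetical helper.

```python
import random

# Assumed counts (not stated in the notebook): 2,435 resumes per group,
# 157 callbacks for black-sounding names and 235 for white-sounding names.
N_B, N_W = 2435, 2435
CALLS_B, CALLS_W = 157, 235


def permutation_p_value(reps=5000, seed=0):
    """Two-sided permutation p-value for the difference in callback rates."""
    rng = random.Random(seed)
    total_n = N_B + N_W
    total_calls = CALLS_B + CALLS_W
    observed = abs(CALLS_W / N_W - CALLS_B / N_B)
    extreme = 0
    for _ in range(reps):
        # Reassign the callbacks to random resume slots; slots below N_W
        # play the role of the white-sounding group in this shuffle.
        slots = rng.sample(range(total_n), total_calls)
        calls_w = sum(slot < N_W for slot in slots)
        calls_b = total_calls - calls_w
        if abs(calls_w / N_W - calls_b / N_B) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (reps + 1)


p_perm = permutation_p_value()
print(f"Permutation p-value: {p_perm:.4f}")
```

With a finite number of shuffles the estimate cannot go below 1/(reps + 1), so a very small p-value shows up here only as "at most about 1 in reps".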



# Q4. Write a story describing the statistical significance in the context of the original problem.

Because the hypothesis test rejected H0 at the 5% significance level (p ≈ 3.9e-05), we can say that there is a statistically significant difference in callback rates between white-sounding and black-sounding names. The 95% confidence interval puts the difference between 1.7 and 4.7 percentage points in favor of white-sounding names. With a standard error of about 0.0078, the observed difference of 3.2 percentage points is roughly 4.1 standard errors away from 0, and even the lower bound of the interval is about 2.2 standard errors away. This supports the conclusion that there may be substantial bias in callback rates.

# Q5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

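No. This analysis tests only one factor, so it cannot show that race/name is the most important factor in callback success; it shows only that race/name has a statistically significant effect on callbacks. To compare importance, the analysis would need to include the other résumé features in the dataset (for example education, years of experience, and honors), for instance through a correlation analysis or a logistic regression of 'call' on all features, and then compare the size of each effect.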