# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [20]:
import pandas as pd
import numpy as np
from scipy import stats

In [21]:
data = pd.io.stata.read_stata('/Users/ming/Downloads/EDA_racial_discrimination/data/us_job_market_discrimination.dta')

In [22]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [23]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

Q1: What test is appropriate for this problem? Does CLT apply?

In [126]:
w = data[data.race=='w']
b = data[data.race=='b']
print('No. of observations for white-sounding names:', len(w))
print('No. of observations for black-sounding names:', len(b))

No. of observations for white-sounding names: 2435
No. of observations for black-sounding names: 2435


Each résumé either receives a callback or does not, so the outcome follows a Bernoulli distribution. To establish whether race has a significant impact on the callback rate, a two-sample z-test for the difference of proportions is appropriate: the sample size (2435 observations per race) is large enough for the Central Limit Theorem (CLT) to apply.
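The claim that the sample is "large enough" can be checked with the usual rule of thumb for proportions: each group should contain at least about 10 callbacks and 10 non-callbacks. The sketch below assumes that threshold of 10 (a convention, not something stated in the exercise) and uses a hypothetical helper name. The counts are those implied by this notebook's output: 235 and 157 callbacks out of 2435 résumés each.

```python
def clt_conditions_met(successes, n, threshold=10):
    """Rule-of-thumb check: both successes and failures reach the threshold.

    The threshold of 10 is a common convention and an assumption here.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Callback counts implied by the notebook output (per 2435 résumés in each group)
white_ok = clt_conditions_met(235, 2435)
black_ok = clt_conditions_met(157, 2435)
print('White-sounding group passes:', white_ok)
print('Black-sounding group passes:', black_ok)
```

Both groups comfortably exceed the threshold, which supports using the normal approximation.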

Q2. What are the null and alternate hypotheses?

H0: the difference in callback proportions between white-sounding and black-sounding names = 0

H1: the difference in callback proportions between white-sounding and black-sounding names != 0
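One way to turn these hypotheses into a test statistic is the pooled two-proportion z-test. The function below is a minimal standalone sketch: its name and signature are hypothetical, and it works from callback counts rather than from the DataFrame. Under H0 both groups share one pooled proportion, which is used for the standard error.

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided pooled z-test for H0: p1 - p2 = 0, given success counts."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 235 and 157 callbacks out of 2435 résumés each, as implied by the notebook output
z, p_value = two_proportion_ztest(235, 2435, 157, 2435)
print('z-statistic:', z, 'p-value:', p_value)
```

Using only the standard library keeps the arithmetic visible; `scipy.stats.norm.cdf`, as used elsewhere in this notebook, gives the same tail probability.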

Q3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

In [124]:
# Your solution to Q3 here

# Significance level = 0.05
# sample proportion of resumes that received a call
p_w = sum(w.call==1)/len(w.call)
p_b = sum(b.call==1)/len(b.call)
p_diff = p_w - p_b
print('Sample proportion of callbacks for white-sounding names: %.4f' %p_w)
print('Sample proportion of callbacks for black-sounding names: %.4f' %p_b)
print ('Sample proportion difference:', p_diff)

# the sample proportions approximate the true proportions because the samples are large
# variance of each sample proportion:
var_w = (p_w * (1-p_w))/len(w.call)
var_b = (p_b * (1-p_b))/len(b.call)
# standard error of the difference in sample proportions:
std_diff = np.sqrt(var_w + var_b)

# margin of error for a 95% confidence interval on the true proportion difference
m_e = 1.96*std_diff # z=1.96 for 95% confidence interval
print('Margin of error %.4f' % m_e)
confidence_interval = [p_diff-m_e, p_diff+m_e]
print ('95% Confidence Interval:',  confidence_interval)
# assume H0 is true, so that p_w = p_b = p (the pooled proportion)
μ_diff = 0 # proportion difference under the null hypothesis
p = (sum(w.call==1) + sum(b.call==1))/(len(w.call) + len(b.call))
var_w = (p * (1-p))/len(w.call)
var_b = (p * (1-p))/len(b.call)
# standard error of the proportion difference under H0:
σ_diff = np.sqrt(var_w + var_b)

z = (p_diff - μ_diff) / σ_diff
pval_z= 2* (1-stats.norm.cdf(abs(z)))
print('z-statistic: ',z, ' p-value: ', pval_z)
print('There is 95% chance that', 'the true difference of proportion is within %.4f of the sample proportions difference %.4f' %(m_e,p_diff),  'such that the true proportion of callbacks for white-sounding names is higher than the black-sounding names by %.4f to %.4f' %(p_diff-m_e, p_diff+m_e))

# Bootstrap with 10,000 replicates
def bootstrap_replicate(data, func):
    """Draw one bootstrap sample (with replacement) from data and apply func to it."""
    bs_sample = np.random.choice(data, size=len(data))
    return func(bs_sample)

bs_w = np.empty(10000)
bs_b = np.empty(10000)
bs_diff = np.empty(10000)

for i in range(10000):
    bs_w[i] = bootstrap_replicate(w.call, np.mean)
    bs_b[i] = bootstrap_replicate(b.call, np.mean)
    bs_diff[i] = bs_w[i] - bs_b[i]

print ('Bootstrapped mean of proportion difference:', np.mean(bs_diff))

bootstrapped_SE = np.std(bs_diff)
# Margin of error = z-score * bootstrapped standard error
m_e = 1.96 * bootstrapped_SE # z=1.96 for 95% confidence interval
print('Bootstrapped Margin of error %.4f' % m_e)

confidence_interval = np.percentile(bs_diff, [2.5, 97.5])
# note: this z combines the bootstrapped mean difference with the pooled standard error σ_diff from above
z = (np.mean(bs_diff) - μ_diff) / σ_diff
pval_z= 2* (1-stats.norm.cdf(abs(z)))
print ('Bootstrapped 95% Confidence Interval:',  confidence_interval)
print('z-statistic: ',z, ' p-value: ', pval_z)

Sample proportion of callbacks for white-sounding names: 0.0965
Sample proportion of callbacks for black-sounding names: 0.0645
Sample proportion difference: 0.032032854209445585
Margin of error 0.0153
95% Confidence Interval: [0.016777447859559147, 0.047288260559332024]
z-statistic:  4.108412152434346  p-value:  3.983886837577444e-05
There is 95% chance that the true difference of proportion is within 0.0153 of the sample proportions difference 0.0320 such that the true proportion of callbacks for white-sounding names is higher than the black-sounding names by 0.0168 to 0.0473
Bootstrapped mean of proportion difference: 0.03205388102196157
Bootstrapped Margin of error 0.0151
Bootstrapped 95% Confidence Interval: [0.01683778 0.04722793]
z-statistic:  4.111108971503381  p-value:  3.937632930162138e-05
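The bootstrapped z-statistic above still relies on the pooled standard error and the normal approximation. A resampling test that simulates the null hypothesis directly is a permutation test, sketched below as an alternative rather than as the method used in this notebook. The function name, seed, and count-based interface are assumptions. The 'race' labels are shuffled across all résumés, and the test counts how often a shuffled difference is at least as extreme as the observed one.

```python
import numpy as np

def permutation_pvalue(x1, n1, x2, n2, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for a difference in proportions."""
    rng = np.random.default_rng(seed)
    # Rebuild the pooled 0/1 callback outcomes from the counts
    calls = np.concatenate([np.ones(x1 + x2), np.zeros(n1 + n2 - x1 - x2)])
    observed = abs(x1 / n1 - x2 / n2)
    extreme = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(calls)  # reassign group labels at random
        diff = shuffled[:n1].mean() - shuffled[n1:].mean()
        # small tolerance guards against floating-point ties
        if abs(diff) >= observed - 1e-12:
            extreme += 1
    return extreme / n_perm

# 235 and 157 callbacks out of 2435 résumés each, as implied by the output above
p_perm = permutation_pvalue(235, 2435, 157, 2435)
print('Permutation p-value:', p_perm)
```

With only 10,000 permutations the smallest non-zero p-value this can report is 1e-04, so a result at or near zero is consistent with the z-test p-value of about 4e-05 rather than a precise estimate of it.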


<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

Q4: Write a story describing the statistical significance in the context of the original problem.

By randomly assigning identical résumés to black-sounding or white-sounding names and observing the requests for interviews from employers, the researchers examined the level of racial discrimination in the United States job market. With 2435 observations per group, the callback rate for white-sounding names (9.65%) was approximately 50% higher than the rate for black-sounding names (6.45%).

A two-sample z-test was conducted to assess the statistical significance of the difference in callback proportions between black-sounding and white-sounding names. The null hypothesis, H0, is that the proportion difference equals zero. The alternative hypothesis, H1, is that the proportion difference does not equal zero. The significance level used is 0.05.

The z-test gives p-values of 3.98e-05 without bootstrapping and 3.94e-05 with bootstrapping. Both p-values are far below the significance level, so in both cases the null hypothesis is rejected. In other words, there is a statistically significant difference between the proportions of black-sounding and white-sounding names that receive callbacks from employers.

In addition, the 95% confidence interval for the true proportion difference is the sample difference 0.0320 plus or minus the margin of error 0.0153: the callback proportion for white-sounding names is estimated to be higher than that for black-sounding names by 0.0168 to 0.0473. The interval excludes zero, which agrees with the result of the hypothesis test.

To conclude, both approaches find a statistically significant difference in callbacks between black-sounding and white-sounding names. This indicates that racial discrimination may pervade the job market: résumés with white-sounding names receive callbacks at a higher rate than identical résumés with black-sounding names.


Q5: Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

My analysis does not mean that race/name is the most important factor in callback success, because it only analysed the relationship between callback success and race/name without considering other variables such as education and experience. To find the most important factor, the correlation between callback success and each of the other variables should be estimated, and the statistical significance of the factors most strongly correlated with callback success should then be determined.
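The correlation step proposed above could be sketched as follows. The helper name is hypothetical, and the small DataFrame holds invented toy values purely to show the mechanics; it is not drawn from the study data. On the real dataset the same call could be made with `data` and numeric columns such as `education` or `yearsexp`.

```python
import pandas as pd

def rank_factors_by_correlation(df, target, candidates):
    """Rank candidate columns by absolute Pearson correlation with the target."""
    corr = df[candidates].corrwith(df[target])
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)

# Toy values invented for illustration only (NOT the study data)
toy = pd.DataFrame({
    'call':     [1, 0, 0, 1, 0, 1, 0, 0],
    'honors':   [1, 0, 0, 1, 0, 1, 0, 0],
    'yearsexp': [6, 2, 7, 3, 5, 4, 8, 1],
})
ranked = rank_factors_by_correlation(toy, 'call', ['honors', 'yearsexp'])
print(ranked)
```

Correlation with a binary outcome only ranks the factors one at a time; a model that includes several variables together would be needed to compare them while holding the others fixed.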


References:
+ https://www.khanacademy.org/math/statistics-probability/confidence-intervals-one-sample/estimating-population-proportion/v/margin-of-error-1
+ https://www.statisticssolutions.com/how-does-margin-of-error-work/
+ https://www.dummies.com/education/science/biology/the-bootstrap-method-for-standard-errors-and-confidence-intervals/