# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [1]:
%matplotlib inline
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

In [2]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

In [5]:
w = data[data.race=='w']
b = data[data.race=='b']

## Does CLT apply?

To apply the CLT, the following conditions must hold: <br>
    Random sample assignment <br>
    If sampling without replacement, n < 10% of the population
   
Sample size <br>
    Each group should have at least 10 expected successes and 10 expected failures (np >= 10 and n(1-p) >= 10), where p is the pooled proportion:
    
    p = (n1*p1 + n2*p2) / (n1 + n2)

In [6]:
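# Number of résumés per race:
w_cv = len(w)
b_cv = len(b)

# Number of callbacks per race:
w_calls = sum(w.call)
b_calls = sum(b.call)

# Sample proportions:
w_samp_p = w_calls / w_cv
b_samp_p = b_calls / b_cv

print('Total % call for Black :', b_samp_p*100,'\nTotal % call for White:', w_samp_p*100)

# Pooled sample proportion (the estimate of the common proportion under H0)
p_prop = (w_calls + b_calls)/(w_cv + b_cv)
print('Population Proportion: ',p_prop)

# Check np and n(1-p): expected successes and failures per group.
# Both groups contain the same number of résumés in this dataset,
# so one pair of values covers both groups.
expected_successes = p_prop*w_cv
expected_failures = (1-p_prop)*b_cv

print('np , n(1-p) values :',expected_successes, expected_failures)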

Total % call for Black : 6.4476386037 
Total % call for White: 9.65092402464
Population Proportion:  0.0804928131417
np , n(1-p) values : 196.0 2239.0


Both expected counts (196 successes and 2,239 failures per group) are well above 10, so the conditions are met and we can apply the CLT.

# Q2: Null & Alternate Hypotheses

**Null Hypothesis:** The probability of success (getting a callback) is the same for white-sounding and black-sounding names.<br>
                $H_0: p_w - p_b = 0$ <br>
**Alternate Hypothesis:** The probability of getting a callback differs between the two groups. <br>
                $H_a: p_w - p_b \neq 0$
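
For reference, the two-sample z-test of proportions used for these hypotheses can be written out as follows. This is the standard pooled form; the symbols $x_w$, $x_b$ (callback counts), $n_w$, $n_b$ (résumés per group), and $\hat{p}$ (pooled proportion) are notation introduced here for illustration, with $\hat{p}_w$ and $\hat{p}_b$ the sample callback proportions for white-sounding and black-sounding names.

$$
\hat{p} = \frac{x_w + x_b}{n_w + n_b}, \qquad
z = \frac{\hat{p}_w - \hat{p}_b}{\sqrt{\hat{p}\,(1-\hat{p})\left(\dfrac{1}{n_w} + \dfrac{1}{n_b}\right)}}
$$

The pooled proportion appears in the standard error because the null hypothesis assumes both groups share one callback probability. A confidence interval for the difference does not assume this, so it uses the unpooled standard error $\sqrt{\hat{p}_w(1-\hat{p}_w)/n_w + \hat{p}_b(1-\hat{p}_b)/n_b}$ instead.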

## Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.

In [7]:
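# Frequentist Approach:
def ztest_proportions_two_samples(r1, n1, r2, n2, one_sided=False):
    """Returns the z-statistic and p-value for a 2-sample Z-test of proportions.

    r1, r2 are the success counts and n1, n2 the sample sizes.
    """
    p1 = r1/n1
    p2 = r2/n2

    # Pooled proportion and standard error under H0 (p1 == p2)
    p_pooled = (r1+r2)/(n1+n2)
    se = np.sqrt(p_pooled*(1-p_pooled)*(1/n1+1/n2))

    z = (p1-p2)/se
    # Upper-tail probability, doubled unless a one-sided test is requested
    p_value = 1-stats.norm.cdf(abs(z))
    p_value *= 2-one_sided
    return z, p_value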
    

In [8]:
# 95% confidence interval
prop_diff = w_samp_p - b_samp_p
print('Observed difference in proportions: \t {}\n'.format(prop_diff))

z_crit = 1.96
# Variance of each sample proportion (unpooled, as appropriate for a CI)
var_w = w_samp_p*(1-w_samp_p)/len(w)
var_b = b_samp_p*(1-b_samp_p)/len(b)
ci_high = prop_diff + z_crit*(np.sqrt(var_w + var_b))
ci_low = prop_diff - z_crit*(np.sqrt(var_w + var_b))

z_stat, p_val = ztest_proportions_two_samples(w_calls, len(w), b_calls, len(b))
print('z-stat: \t {}\np-value: \t {}'.format(z_stat, p_val))

print('95% conf int: \t {} - {}'.format(ci_low, ci_high))
moe = (ci_high - ci_low)/2
print('Margin of err: \t +/-{}'.format(moe))

Observed difference in proportions: 	 0.032032854209445585

z-stat: 	 4.108412152434346
p-value: 	 3.983886837577444e-05
95% conf int: 	 0.016777447859559147 - 0.047288260559332024
Margin of err: 	 +/-0.015255406349886438
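
As a cross-check on the z-test, the same comparison can be run as a chi-square test of independence on the 2x2 table of callbacks by group. The sketch below is illustrative only: the counts (2,435 résumés per group, 235 and 157 callbacks) are reconstructed from the outputs printed in this notebook rather than re-read from the data file, and the variable names are assumptions. Without the continuity correction, the chi-square statistic should equal the square of the z-statistic and give the same two-sided p-value.

```python
import numpy as np
from scipy import stats

# Counts reconstructed from the notebook outputs (assumed, not re-read from the file)
n_w, n_b = 2435, 2435          # résumés per group
calls_w, calls_b = 235, 157    # callbacks per group

# 2x2 contingency table: rows = group, columns = (callback, no callback)
table = np.array([[calls_w, n_w - calls_w],
                  [calls_b, n_b - calls_b]])

# correction=False turns off Yates' continuity correction so the result
# is comparable to the uncorrected pooled z-test
chi2, p_value, dof, expected = stats.chi2_contingency(table, correction=False)

print('chi2: {:.4f}'.format(chi2))
print('sqrt(chi2): {:.4f}'.format(np.sqrt(chi2)))
print('p-value: {:.3e}'.format(p_value))
```

If the square root of the chi-square statistic agrees with the z-statistic reported above, the two frequentist approaches are consistent with each other.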


In [9]:
# Bootstrap
# Pool all callbacks into one array, as assumed under the null hypothesis
r = np.sum(data.call)
n = len(data)
all_callbacks = np.array([True] * int(r) + [False] * int(n-r))

size = 10000

bs_reps_diff = np.empty(size)

for i in range(size):
    # Resample each group, with replacement, from the pooled callbacks
    w_bs_replicates = np.sum(np.random.choice(all_callbacks, size=len(w)))
    b_bs_replicates = np.sum(np.random.choice(all_callbacks, size=len(b)))
    
    bs_reps_diff[i] = w_bs_replicates/len(w) - b_bs_replicates/len(b)
    
# One-sided p-value: share of replicates at least as large as the observed difference
bs_p_value = np.sum(bs_reps_diff >= prop_diff) / len(bs_reps_diff)

# 95% interval of the difference under the null hypothesis
# (centered near zero; not a CI around the observed difference)
bs_ci = np.percentile(bs_reps_diff, [2.5, 97.5])
bs_mean_diff = np.mean(bs_reps_diff)

print('obs diff: {}\n'.format(prop_diff))
print('BOOTSTRAP RESULTS\np-value: {}\n95% conf. int.: {}'.format(bs_p_value, bs_ci))

obs diff: 0.032032854209445585

BOOTSTRAP RESULTS
p-value: 0.0
95% conf. int.: [-0.01519507  0.01519507]
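
The interval printed above is centered on zero because the replicates are drawn from the pooled callbacks, so it describes the difference expected under the null hypothesis. A bootstrap confidence interval for the observed difference would instead resample each group from its own data. The following is a minimal sketch of that variant, assuming the group sizes and callback counts implied by the earlier outputs (2,435 résumés per group, 235 and 157 callbacks); the seed and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility

# Counts reconstructed from the notebook outputs (assumed)
n_w, n_b = 2435, 2435
calls_w, calls_b = 235, 157

# Rebuild a 0/1 callback array for each group
w_arr = np.array([1] * calls_w + [0] * (n_w - calls_w))
b_arr = np.array([1] * calls_b + [0] * (n_b - calls_b))

n_reps = 10000
diffs = np.empty(n_reps)
for i in range(n_reps):
    # Resample each group separately, with replacement, at its original size
    w_sample = rng.choice(w_arr, size=n_w, replace=True)
    b_sample = rng.choice(b_arr, size=n_b, replace=True)
    diffs[i] = w_sample.mean() - b_sample.mean()

# Percentile bootstrap 95% confidence interval and margin of error
ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])
moe = (ci_high - ci_low) / 2
print('bootstrap 95% CI for the difference: {:.4f} to {:.4f}'.format(ci_low, ci_high))
print('bootstrap margin of error: +/-{:.4f}'.format(moe))
```

This interval should be close to the frequentist 95% confidence interval computed earlier, and an interval that excludes zero is consistent with rejecting the null hypothesis.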


<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

Because the p-value is far below the 0.05 significance level under both the frequentist and bootstrap approaches, we reject the null hypothesis in favor of the alternative: the race perceived from a name does affect whether an applicant receives a callback. This does not mean race is the most important factor, however. Other résumé attributes, such as years of experience, may also influence callbacks and would need to be included in the analysis to compare their effects.


## Statistical Significance and Analysis:

The analysis provides strong evidence that résumés with white-sounding names receive callbacks at a significantly higher rate than résumés with black-sounding names (about 9.65% versus 6.45%, z ≈ 4.11, p ≈ 4e-05). In this sample, résumés with white-sounding names were approximately 50% more likely to receive a callback.
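
The "approximately 50% more likely" figure is the ratio of the two callback rates. A short illustrative calculation, assuming the counts implied by the earlier outputs (235 and 157 callbacks out of 2,435 résumés per group), is:

```python
# Counts reconstructed from the notebook outputs (assumed)
n_w, n_b = 2435, 2435
calls_w, calls_b = 235, 157

rate_w = calls_w / n_w   # callback rate, white-sounding names
rate_b = calls_b / n_b   # callback rate, black-sounding names

# Relative callback rate: how many times more likely a callback is
relative_rate = rate_w / rate_b
pct_more_likely = (relative_rate - 1) * 100

print('callback rates: white {:.4f}, black {:.4f}'.format(rate_w, rate_b))
print('relative rate: {:.3f} ({:.1f}% more likely)'.format(relative_rate, pct_more_likely))
```

The relative rate expresses the gap in proportional terms, while the difference in proportions (about 3.2 percentage points) expresses it in absolute terms. Both describe the same sample.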