# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.
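Because `call` is coded as 0 or 1, the callback rate for each group is simply the mean of `call` within that group. The sketch below illustrates this on a small made-up table; the `toy` DataFrame and its values are assumptions for illustration only and are not rows from the real dataset.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration).
toy = pd.DataFrame({
    "race": ["w", "w", "w", "w", "b", "b", "b", "b"],
    "call": [1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0],
})

# Because 'call' is 0/1, its mean within each group is that group's callback rate.
callback_rate = toy.groupby("race")["call"].mean()
print(callback_rate)
```

The same `groupby` call should work on the full dataset once it is loaded, assuming the column names match those described above.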

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [37]:
import pandas as pd
import numpy as np
from scipy import stats

In [38]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

In [39]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

In [40]:
data.head()

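| | id | ad | education | ofjobs | yearsexp | honors | volunteer | military | empholes | occupspecific | ... | compreq | orgreq | manuf | transcom | bankreal | trade | busservice | othservice | missind | ownership |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | b | 1 | 4 | 2 | 6 | 0 | 0 | 0 | 1 | 17 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 1 | b | 1 | 3 | 3 | 6 | 0 | 1 | 1 | 0 | 316 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 2 | b | 1 | 4 | 1 | 6 | 0 | 0 | 0 | 0 | 19 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 3 | b | 1 | 3 | 4 | 6 | 0 | 1 | 0 | 1 | 313 | ... | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| 4 | b | 1 | 3 | 3 | 22 | 0 | 0 | 0 | 0 | 313 | ... | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | Nonprofit |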


<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

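Q1. The appropriate test is a two-sample z-test for the difference between two proportions: the callback rate for white-sounding names and the callback rate for black-sounding names. Each callback is a 0/1 (Bernoulli) outcome and both samples are much larger than 30, so a z statistic is suitable for the frequentist test.

The CLT applies because names were assigned to resumes at random, the observations are independent, the sample is less than 10% of the total population, and each group contains enough callbacks and non-callbacks (at least about 10 of each) for the sample proportion to be approximately normal.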
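A common rule of thumb for applying the CLT to a proportion is that each group has at least 10 "successes" and 10 "failures". The sketch below checks that rule. The 235 callbacks for white-sounding names come from the notebook output above; the group size of 2435 and the 157 callbacks for black-sounding names are assumptions (chosen to be consistent with the difference in rates printed later), and `clt_ok` is a hypothetical helper, not part of any library.

```python
# Counts per group as (callbacks, group size).
# 235 is printed in the notebook; 157 and 2435 are assumed values.
groups = {"w": (235, 2435), "b": (157, 2435)}

def clt_ok(successes, n, threshold=10):
    """Return True when both successes and failures number at least `threshold`."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

for name, (calls, n) in groups.items():
    print(name, "callbacks:", calls, "non-callbacks:", n - calls, "ok:", clt_ok(calls, n))
```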

Q2. The null hypothesis: white-sounding and black-sounding names receive the same rate of callbacks from employers.

The alternate hypothesis: white-sounding names receive a higher rate of callbacks from employers than black-sounding names.
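Written symbolically, with $p_w$ and $p_b$ the true callback rates for white-sounding and black-sounding names, the hypotheses and the usual pooled two-proportion test statistic can be sketched as follows. The symbols $x_w$, $x_b$ (callback counts) and $n_w$, $n_b$ (group sizes) are notation introduced here for illustration.

$$H_0: p_w = p_b \qquad H_A: p_w > p_b$$

$$\hat{p} = \frac{x_w + x_b}{n_w + n_b}, \qquad z = \frac{\hat{p}_w - \hat{p}_b}{\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_w} + \frac{1}{n_b}\right)}}$$

Because the alternate hypothesis is one-sided, the p-value is the upper-tail probability $P(Z \ge z)$ under a standard normal distribution.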

In [41]:
w = data[data.race=='w']
b = data[data.race=='b']

In [42]:
# Your solution to Q3 here

# Start with bootstrapping, using a two-sample bootstrap hypothesis test.
# Define the draw functions.
def bootstrap_replicate_1d(data, func):
    """Resample data with replacement and apply func to the resample."""
    return func(np.random.choice(data, size=len(data)))

def draw_bs_reps(data, func, size=1):
    """Draw bootstrap replicates."""

    # Initialize array of replicates: bs_replicates
    bs_replicates = np.empty(size)

    # Generate replicates
    for i in range(size):
        bs_replicates[i] = bootstrap_replicate_1d(data, func)

    return bs_replicates

wcb = np.array(w.call)
bcb = np.array(b.call)
cb_concat = np.concatenate((wcb, bcb))
cb_concat_mean = np.mean(cb_concat)

# Shift wcb and bcb so both have the pooled mean (simulates the null hypothesis)
wcb_shift = wcb - np.mean(wcb) + cb_concat_mean
bcb_shift = bcb - np.mean(bcb) + cb_concat_mean

bs_reps_w = draw_bs_reps(wcb_shift, np.mean, 10000)
bs_reps_b = draw_bs_reps(bcb_shift, np.mean, 10000)

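# Difference of means for each pair of replicates, simulated under the null
bs_reps = bs_reps_w - bs_reps_b

# One-sided p-value: fraction of simulated differences at least as large as the observed one
p = np.sum(bs_reps >= (np.mean(wcb) - np.mean(bcb))) / len(bs_reps)
# Central 99% range of the differences simulated under the null hypothesis
# (this is not a confidence interval for the observed difference)
conf_int = np.percentile(bs_reps,[0.5, 99.5])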

print('p-value =', p)
print('Margin of error at 99% confidence of difference of call back rate =', round(conf_int[0],4), 'to', round(conf_int[1], 4))


p-value = 0.0
Margin of error at 99% confidence of difference of call back rate = -0.0205 to 0.0193
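The range printed above describes differences simulated under the null hypothesis. A confidence interval for the observed difference can be sketched by resampling each group as it is, without shifting the means. The example below is a minimal, self-contained sketch: the arrays are rebuilt from assumed counts (2435 resumes per group, 235 and 157 callbacks; only the 235 appears in the notebook output), and `bootstrap_diff_of_means` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

# Assumed counts: 2435 resumes per group, 235 (white) and 157 (black) callbacks.
wcb_demo = np.r_[np.ones(235), np.zeros(2435 - 235)]
bcb_demo = np.r_[np.ones(157), np.zeros(2435 - 157)]

def bootstrap_diff_of_means(a, b, n_reps, rng):
    """Resample each group with replacement; return the differences in means."""
    diffs = np.empty(n_reps)
    for i in range(n_reps):
        resample_a = rng.choice(a, size=len(a))  # sampling with replacement
        resample_b = rng.choice(b, size=len(b))
        diffs[i] = resample_a.mean() - resample_b.mean()
    return diffs

diffs = bootstrap_diff_of_means(wcb_demo, bcb_demo, 5000, rng)

# 99% percentile interval for the difference in callback rates
ci_low, ci_high = np.percentile(diffs, [0.5, 99.5])
margin = (ci_high - ci_low) / 2  # half-width, used here as the margin of error
print("99% CI:", round(ci_low, 4), "to", round(ci_high, 4), "| margin about", round(margin, 4))
```

If the interval excludes zero, that agrees with rejecting the null hypothesis at the 1% level.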


In [43]:
# Now do frequentist
# Null hypothesis is that proportion of white call back (pw) is equal to proportion of black call back (pb)

pw = np.mean(wcb)
pb = np.mean(bcb)
ppop = np.mean(cb_concat)

# Standard error of (pw - pb) if the null hypothesis is correct, using the
# pooled proportion and each group's own sample size
assumed_std = np.sqrt(ppop*(1-ppop)*(1/len(wcb) + 1/len(bcb)))
zscore = (pw - pb) / assumed_std
# One-sided p-value, matching the one-sided alternate hypothesis
p_value = stats.norm.sf(abs(zscore))
print('p-value =', round(p_value, 6))
print('if null is true, then margin of error at 99% confidence in difference of call back rate =',
      round(-2.58*assumed_std, 4), 'to', round(+2.58*assumed_std, 4))
print('actual difference in call back rate =', pw-pb)

p-value = 2e-05
if null is true, then margin of error at 99% confidence in difference of call back rate = -0.0201 to 0.0201
actual difference in call back rate = 0.032032855
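Question 3 also asks for a confidence interval around the observed difference. One common way to get it is to use the unpooled standard error, since the interval no longer assumes the two rates are equal. The sketch below uses only the standard library; the group sizes of 2435 and the 157 callbacks for black-sounding names are assumptions consistent with the printed difference, not values shown in the notebook.

```python
from math import sqrt
from statistics import NormalDist

# Assumed counts: 2435 resumes per group; 235 is from the notebook, 157 is inferred.
n_w = n_b = 2435
calls_w, calls_b = 235, 157

pw_hat = calls_w / n_w
pb_hat = calls_b / n_b
diff = pw_hat - pb_hat

# Unpooled standard error: each group keeps its own estimated proportion
se_unpooled = sqrt(pw_hat * (1 - pw_hat) / n_w + pb_hat * (1 - pb_hat) / n_b)

# Two-sided 99% critical value (about 2.576)
z_crit = NormalDist().inv_cdf(0.995)
margin_of_error = z_crit * se_unpooled
ci = (diff - margin_of_error, diff + margin_of_error)

print("difference =", round(diff, 4))
print("margin of error (99%) =", round(margin_of_error, 4))
print("99% CI =", round(ci[0], 4), "to", round(ci[1], 4))
```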


<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

Q4.  
Both the bootstrap and the frequentist hypothesis tests give a p-value of zero or very close to zero, so the null hypothesis, that resumes with white-sounding and black-sounding names receive the same callback rate from employers, should be rejected. If the null hypothesis were true, 99% of the bootstrap-simulated differences in callback rate would fall between -0.0205 and 0.0193. The observed difference in callback rate is 0.0320 (about 3.2 percentage points), which lies far outside that range and also outside the corresponding range from the frequentist calculation. This indicates that a black-sounding name on a resume is associated with a statistically significant reduction in callbacks from employers compared with a white-sounding name.

Q5.  
This does not mean that a black-sounding or white-sounding name is the most important factor in receiving a callback from employers. The dataset contains numerous other columns (for example education, years of experience, and honors) that can be tested for significance in the same way. Comparing the size and significance of each factor's effect would show how important each one is relative to race. The analysis should also look at confounding among those variables.
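As a rough sketch of how other factors could be screened, the hypothetical helper below runs the same pooled two-proportion z-test on any 0/1 column. The column names `honors` and `volunteer` appear in the `data.head()` output, but the toy values here are made up for illustration, and `callback_gap` is not a library function. Testing one factor at a time ignores interactions and confounding; a multivariate model such as logistic regression would be one way to address that.

```python
from math import sqrt
from statistics import NormalDist

import pandas as pd

def callback_gap(df, factor, outcome="call"):
    """Two-sided pooled z-test of `outcome` rate for factor == 1 versus factor == 0."""
    yes = df.loc[df[factor] == 1, outcome]
    no = df.loc[df[factor] == 0, outcome]
    p_pool = df[outcome].mean()
    se = sqrt(p_pool * (1 - p_pool) * (1 / len(yes) + 1 / len(no)))
    diff = yes.mean() - no.mean()
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, p_value

# Toy data (made up): 200 resumes, 30 callbacks in total.
toy = pd.DataFrame({
    "honors": [1] * 100 + [0] * 100,
    "volunteer": [1, 0] * 100,
    "call": [1] * 20 + [0] * 80 + [1] * 10 + [0] * 90,
})

for factor in ["honors", "volunteer"]:
    diff, p_value = callback_gap(toy, factor)
    print(factor, "difference =", round(diff, 3), "p-value =", round(p_value, 4))
```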