# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.

### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value. Try using both the bootstrapping and the frequentist statistical approaches.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet
+ Formulas for the Bernoulli distribution: https://en.wikipedia.org/wiki/Bernoulli_distribution

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
# Load the Stata file through the public pandas reader
data = pd.read_stata('data/us_job_market_discrimination.dta')

<div class="span5 alert alert-success">
<p>Your answers to Q1 and Q2 here</p>
</div>

In [3]:
data.shape

(4870, 65)

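The dataset has 4,870 records. Each resume's outcome is a Bernoulli variable (call or no call), and the race labels were assigned randomly, so the observations can be treated as independent. With a sample this large, the CLT applies: the sampling distribution of each group's call rate, and of the difference between them, is approximately normal even though the individual outcomes are not. A two-sample z-test for the difference in proportions is therefore appropriate.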

We will measure the probability of getting a phone call by race, written pb (for black-sounding names) and pw (for white-sounding names).
- Null Hypothesis - pb = pw: candidates receive a phone call at the same rate regardless of how their names sound
- Alternative Hypothesis - pb < pw: candidates with black-sounding names have a lower chance of receiving a phone call
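
As a rough check that the normal approximation is reasonable, a common rule of thumb asks for at least 10 successes and 10 failures in each group. The sketch below applies it to callback counts reconstructed from the call rates reported in this notebook, assuming the 4,870 resumes split evenly into two groups of 2,435; the helper name `clt_ok` and the threshold are illustrative choices, not part of the original analysis.

```python
# Counts reconstructed from the notebook's printed call rates, ASSUMING
# two equal groups of 2,435 resumes (not stated explicitly in the data).
n_b, n_w = 2435, 2435
calls_b, calls_w = 157, 235


def clt_ok(successes, n, threshold=10):
    """Rule-of-thumb check: enough successes and enough failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


print("black-sounding group passes:", clt_ok(calls_b, n_b))
print("white-sounding group passes:", clt_ok(calls_w, n_w))
```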

In [4]:
# Number of callbacks for white-sounding (w) and black-sounding (b) names, and overall
w_call = sum(data[data.race=='w'].call)
b_call = sum(data[data.race=='b'].call)
ttl_call = sum(data.call)
# Calculate pb & pw
pb = b_call/len(data[data.race=='b'])
pw = w_call/len(data[data.race=='w'])
pdiff = pb-pw
pttl = ttl_call/len(data)

print(pb, pw, pdiff, pttl)

0.06447638603696099 0.09650924024640657 -0.032032854209445585 0.08049281314168377


In the sample, the call rate is about 6.4% for black-sounding names and 9.7% for white-sounding names, a difference of 3.2 percentage points. The overall call rate is about 8%.

To assess whether the observed difference could have arisen by chance, we will perform a bootstrap test and a z-test. For the bootstrap test, we generate many replicates of pb and pw, calculate the difference for each, and compare those differences with the observed one to see how extreme it is.

#### Bootstrap Method

In [5]:
# Generate 1,000 replicates of the race and call fields, each resampled independently with replacement
bs_replicate_call = np.array([np.random.choice(data.call, len(data), replace = True) for _ in range(1000)])
bs_replicate_race = np.array([np.random.choice(data.race, len(data), replace = True) for _ in range(1000)])

In [6]:
# Build each bootstrap replicate by pairing a call replicate with a race replicate, then calculate the difference in call rates.
# Because the two columns are resampled independently, race and call are unrelated in every replicate, which simulates the null hypothesis
diff = np.empty(1000)
for i in range(1000):
    table = pd.DataFrame({'race': bs_replicate_race[i], 'call': bs_replicate_call[i]}, columns=['race', 'call'])
    pbt = sum(table[table.race=='b'].call)/len(table[table.race=='b'])
    pwt = sum(table[table.race=='w'].call)/len(table[table.race=='w'])
    diff[i] = pbt-pwt

In [7]:
# One-sided p-value: share of replicates whose call rate difference is at least as negative as the observed difference
p = np.sum(diff<=pdiff)/len(diff)
print(p)

0.0


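A p-value of 0.0 means that none of the 1,000 replicates produced a difference as extreme as the observed one, so the true p-value is below roughly 1/1000. The observed difference in call rates is very unlikely under the null hypothesis, and we have significant evidence to reject it.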
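
A closely related way to simulate the null hypothesis is a permutation test, which shuffles the callback outcomes across the fixed race labels instead of resampling both columns. The sketch below is an illustration, not the method used in the cells above: it rebuilds 0/1 outcome arrays from callback counts (assumed to be 157 of 2,435 and 235 of 2,435, consistent with the printed rates) and adds one to the numerator and denominator so the reported p-value is never exactly zero.

```python
import numpy as np

rng = np.random.default_rng(42)

# ASSUMED counts, reconstructed from the printed call rates.
n_b, n_w = 2435, 2435
calls_b, calls_w = 157, 235

# 0/1 callback outcomes: the first n_b entries are 'b' resumes, the rest 'w'.
call = np.concatenate([
    np.ones(calls_b), np.zeros(n_b - calls_b),
    np.ones(calls_w), np.zeros(n_w - calls_w),
])
is_b = np.arange(call.size) < n_b

observed = call[is_b].mean() - call[~is_b].mean()

# Shuffle outcomes across the fixed labels to break any race/call link.
n_perm = 2000
perm_diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(call)
    perm_diffs[i] = shuffled[is_b].mean() - shuffled[~is_b].mean()

# One-sided p-value with an add-one correction (never exactly zero).
p_value = (np.sum(perm_diffs <= observed) + 1) / (n_perm + 1)
print(observed, p_value)
```

With 2,000 permutations the smallest reportable p-value is 1/2001, which makes explicit that a simulated p-value is bounded by the number of replicates.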

#### z test

In [8]:
w = data[data.race=='w']
b = data[data.race=='b']

In [9]:
# Standard error of the difference in proportions under the null hypothesis,
# using the pooled call rate pttl computed above
dev = np.sqrt(pttl*(1-pttl)*(1/len(b) + 1/len(w)))
z_score = (pdiff-0)/dev
print(f"{z_score:.2f}")

-4.11


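The z score lies below -1.645, the critical value for a one-sided test at the 5% significance level, and also below -1.96, the two-sided critical value. We have significant evidence to reject the null hypothesis.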
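
The margin of error, confidence interval, and p-value asked for in Q3 can be sketched with the standard two-proportion formulas. This is a minimal illustration under stated assumptions: the counts (157 and 235 callbacks out of 2,435 resumes each) are reconstructed from the printed rates, the interval uses the unpooled standard error, the test uses the pooled one, and the normal CDF is computed with `math.erfc` rather than SciPy so the block has no dependencies.

```python
import math

# ASSUMED counts, reconstructed from the printed call rates.
n_b, n_w = 2435, 2435
calls_b, calls_w = 157, 235

pb, pw = calls_b / n_b, calls_w / n_w
diff = pb - pw

# 95% confidence interval for pb - pw (unpooled standard error).
se_unpooled = math.sqrt(pb * (1 - pb) / n_b + pw * (1 - pw) / n_w)
margin = 1.96 * se_unpooled
ci = (diff - margin, diff + margin)

# z-test under the null pb == pw (pooled standard error).
p_pool = (calls_b + calls_w) / (n_b + n_w)
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))
z = diff / se_pooled

# One-sided p-value P(Z <= z) for a standard normal Z.
p_one_sided = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"margin of error: {margin:.4f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"z = {z:.2f}, one-sided p = {p_one_sided:.2e}")
```

An interval that excludes zero tells the same story as the test: the gap in call rates is not plausibly the result of sampling noise alone.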

<div class="span5 alert alert-success">
<p> Your answers to Q4 and Q5 here </p>
</div>

The hypothesis tests favor the alternative hypothesis: the racial connotation of a candidate's name affects the chance of getting a call from recruiters. In this sample the gap is about 3.2 percentage points (6.4% for black-sounding names versus 9.7% for white-sounding names), so resumes with black-sounding names received roughly one third fewer callbacks.

We cannot, however, conclude that race is the most important factor in callback rates. The dataset has 65 attributes in total, and this analysis examines only one of them. Although the exercise describes the resumes as identical, the data show that attributes vary across resumes. It is possible that black-sounding names were paired by chance with fewer years of experience, which would lower callback rates if recruiters used experience as a hiring filter. To amend the analysis, we would compare the other attributes across the two groups and model callbacks against several factors at once.
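
One way to probe that concern is a balance check: compare the average of other resume attributes across the two name groups. The sketch below uses a tiny made-up table purely to show the shape of the computation; the column names `race`, `yearsexp`, and `education` appear in the dataset, but the values here are hypothetical and say nothing about the real data.

```python
import pandas as pd

# Toy stand-in for the real DataFrame: values are INVENTED for illustration.
toy = pd.DataFrame({
    "race": ["b", "w", "b", "w", "b", "w"],
    "yearsexp": [6, 7, 2, 2, 8, 6],
    "education": [4, 4, 4, 4, 3, 4],
})

# Mean of each attribute within each name group; similar means across
# groups suggest the attribute is balanced by the random assignment.
balance = toy.groupby("race")[["yearsexp", "education"]].mean()
print(balance)
```

On the real data, the same `groupby` could be run on `data` in place of `toy`, with large gaps between the groups flagging attributes worth controlling for.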

In [10]:
data.tail(15)

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
4855,a,96b,4,3,7,0,0,0,1,274,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,
4856,a,96b,4,4,6,0,0,0,0,285,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,
4857,a,96b,4,4,2,0,1,1,0,267,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,
4858,a,98b,4,4,2,0,1,1,0,267,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
4859,a,98b,4,4,6,0,0,0,0,285,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
4860,a,98b,4,6,8,0,1,0,0,21,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
4861,a,98b,4,3,7,0,0,0,1,274,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,Private
4862,b,99,3,5,13,0,0,0,0,27,...,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,Private
4863,b,99,2,4,16,0,0,0,1,27,...,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,Private
4864,b,99,3,5,26,1,1,0,1,313,...,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,Private
