
### Examining racial discrimination in the US job market

#### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning black-sounding or white-sounding names to otherwise identical resumes and observing the impact on requests for interviews from employers.

#### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes.

#### Exercise
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Discuss statistical significance.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [14]:
# number of callbacks for black-sounding names
p1n1 = sum(data[data.race=='b'].call)
p1n1

157.0

In [9]:
n1 = len(data[data.race=='b'])
n1

2435

In [10]:
n2 = len(data[data.race=='w'])
n2

2435

In [12]:
p2n2 = sum(data[data.race=='w'].call)
p2n2

235.0

###   1. What test is appropriate for this problem? Does CLT apply?
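A two-proportion z-test (a test for the difference of two proportions) is appropriate, because the outcome is binary and we are comparing callback rates between two groups. The CLT applies: names were assigned to resumes at random, so the observations can be treated as independent, and the success-failure condition holds in both groups (157 callbacks and 2278 non-callbacks for 'b'; 235 callbacks and 2200 non-callbacks for 'w', all well above 10). Under these conditions the sampling distribution of the difference p1_est - p2_est is approximately normal.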
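
The success-failure condition can be checked directly from the counts. The sketch below hard-codes the counts computed in the cells above; the helper name `success_failure_ok` and the threshold of 10 are assumptions (10 is a common rule of thumb, not something fixed by the dataset).

```python
# Counts copied from the cells above: (callbacks, resumes) per group.
groups = {"b": (157, 2435), "w": (235, 2435)}


def success_failure_ok(successes, n, threshold=10):
    """Return True when both successes and failures reach the threshold.

    The default threshold of 10 is a rule-of-thumb assumption.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold


# Evaluate the condition separately for each group.
checks = {race: success_failure_ok(s, n) for race, (s, n) in groups.items()}
print(checks)
```

Both groups pass comfortably, which supports using the normal approximation here.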


###   2. What are the null and alternate hypotheses?
H0: The proportion of callbacks to resumes with black-sounding names is the same as the proportion of callbacks to resumes with white-sounding names.

HA: The proportion of callbacks to resumes with black-sounding names is lower than the proportion of callbacks to resumes with white-sounding names.

I am assuming a one-sided test here, since no one is claiming that resumes with black-sounding names receive preferential treatment.


###   3. Compute margin of error, confidence interval, and p-value.

In [15]:
# Pooled callback rate under the null hypothesis (both groups share one proportion)
p_est = (p1n1 + p2n2)/(n1+n2)
p_est

0.080492813141683772

In [16]:
SE = np.sqrt(p_est*(1-p_est)/n1 + p_est*(1-p_est)/n2)
SE

0.0077968940361704568

In [26]:
z = stats.norm.ppf(0.975)
z

1.959963984540054

In [27]:
Margin_of_Error = z*SE
Margin_of_Error

0.015281631502169232

In [28]:
point_estimate = p1n1/n1 - p2n2/n2
point_estimate

-0.032032854209445585

In [29]:
CI = [point_estimate-Margin_of_Error,point_estimate+Margin_of_Error]
print('Confidence Interval =', CI)

Confidence Interval = [-0.047314485711614819, -0.016751222707276352]


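The 95% confidence interval for the difference in callback rates (black-sounding minus white-sounding) runs from -4.73 to -1.68 percentage points. The interval excludes zero.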
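
The interval above reuses the pooled standard error from the hypothesis test. A common textbook convention is to build the confidence interval from the unpooled standard error instead, since the interval does not assume the two proportions are equal. The following is a minimal sketch of that alternative, with the counts hard-coded from the earlier cells; the variable names are illustrative and not part of the original analysis.

```python
import numpy as np
from scipy import stats

# Counts copied from the cells above.
calls_b, n_b = 157, 2435
calls_w, n_w = 235, 2435

p_b = calls_b / n_b
p_w = calls_w / n_w
diff = p_b - p_w

# Unpooled standard error: each group contributes its own variance term.
se_unpooled = np.sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)

# Two-sided 95% interval using the standard normal critical value.
z_crit = stats.norm.ppf(0.975)
ci_unpooled = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
print(se_unpooled, ci_unpooled)
```

With samples this large and proportions this close, the unpooled interval is nearly identical to the pooled one, so the conclusion does not change.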

#### P-Value

In [30]:
Z = (point_estimate - 0)/SE
Z

-4.1084121524343464

In [31]:
stats.norm.cdf(Z)

1.9919434187925383e-05

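The one-sided p-value is about 2e-05, so p < 0.001. We reject H0: the callback rate for resumes with black-sounding names is significantly lower than for resumes with white-sounding names.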
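
As a cross-check, the steps above can be collected into one function. This is a sketch only: `two_proportion_ztest` is a hypothetical helper written for this notebook, not a SciPy function, and the counts are hard-coded from the earlier cells. It also reports the two-sided p-value for readers who prefer not to assume a direction.

```python
import numpy as np
from scipy import stats


def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled z statistic and lower-tail p-value for H0: p1 == p2, HA: p1 < p2."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, stats.norm.cdf(z)


# Group 1 = black-sounding names, group 2 = white-sounding names.
z_stat, p_one_sided = two_proportion_ztest(157, 2435, 235, 2435)

# Two-sided p-value: probability of a statistic at least this extreme in either tail.
p_two_sided = 2 * stats.norm.sf(abs(z_stat))
print(z_stat, p_one_sided, p_two_sided)
```

Even the two-sided p-value stays far below 0.001, so the choice of a one-sided test does not drive the result.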

###    4. Discuss statistical significance.

Statistical significance means that the observed difference is very unlikely to be due to chance alone, but that is not the same as practical significance. Resumes with black-sounding names could reliably get fewer callbacks while the difference itself is small.

In this case, resumes with white-sounding names get ~9.7 callbacks per hundred and resumes with black-sounding names get ~6.4, a gap of about 3.2 percentage points. This is not so small.

In [37]:
p1n1/n1

0.064476386036960986

In [38]:
p2n2/n2

0.096509240246406572
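
To put the gap in practical terms, it can help to look at it both as an absolute difference and as a ratio. The sketch below uses the counts from the cells above; the variable names are illustrative.

```python
# Callback rates from the counts computed earlier in the notebook.
rate_b = 157 / 2435
rate_w = 235 / 2435

# Absolute gap in percentage points and relative ratio of the two rates.
abs_gap_pp = (rate_w - rate_b) * 100
rel_ratio = rate_w / rate_b
print(round(abs_gap_pp, 2), round(rel_ratio, 2))
```

Because both groups have the same number of resumes, the ratio is simply 235/157: resumes with white-sounding names received roughly 50% more callbacks.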