
### Examining racial discrimination in the US job market

#### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

#### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes.

#### Exercise
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Discuss statistical significance.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.io.stata.read_stata('data/us_job_market_discrimination.dta')

**Basic Data Exploration**

In [32]:
# number of callbacks for black-sounding names
b = sum(data[data.race=='b'].call)
b

157.0

In [33]:
# number of callbacks for white-sounding names
w = sum(data[data.race == 'w'].call)
w

235.0

In [36]:
#number of white-sounding applicants
n1 = sum(data.race == 'w')
n1

2435

In [35]:
#number of black-sounding applicants
n2 = sum(data.race == 'b')
n2

2435
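The four counts above can also be obtained in a single grouped summary. The sketch below is only an illustration: it builds a toy DataFrame (`toy`, a hypothetical stand-in for `data`) with the same group sizes and callback counts reported above, so it runs without the `.dta` file. On the real data the same `groupby` call on `data` should apply, assuming the columns are named `race` and `call` as described.

```python
import pandas as pd

# Toy stand-in for the real dataset: same group sizes and callback counts
# as reported above (2435 resumes per group; 235 and 157 callbacks).
toy = pd.DataFrame({
    'race': ['w'] * 2435 + ['b'] * 2435,
    'call': ([1.0] * 235 + [0.0] * (2435 - 235) +
             [1.0] * 157 + [0.0] * (2435 - 157)),
})

# One pass: number of callbacks, number of resumes, and callback rate per group.
summary = toy.groupby('race')['call'].agg(['sum', 'count', 'mean'])
summary.columns = ['callbacks', 'resumes', 'callback_rate']
print(summary)
```

Because `call` is coded 0/1, its mean within each group is the callback rate for that group.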

**1. What test is appropriate for this problem? Does CLT apply?**

*CLT applies since the sample size is very large (2,435 resumes in each of the white-sounding and black-sounding name groups) and each group contains well over 10 callbacks and 10 non-callbacks.
Names were assigned to otherwise identical resumes at random, so the two groups can be treated as independent random samples.*

**A two-proportion z-test** will allow us to compare the rate of call backs between the black-sounding names group and the white-sounding
names group.

**2. What are the null and alternate hypotheses?**

The **null hypothesis** is that the proportion of call backs for *black-sounding names* and *white-sounding names* is **the same**.


$$H_0 : P_1 - P_2 = 0$$

The **alternate hypothesis** is that the proportion of call backs for *black-sounding names* and *white-sounding names* is **different**.

$$H_A: P_1 - P_2 \neq 0$$

**3. Compute margin of error, confidence interval, and p-value.**

*Let's compute a **95% Confidence Interval** for the difference in population proportions, that is, the difference in the rate of call backs between
white-sounding and black-sounding names.*

*Margin of Error = critical value x Standard Error of statistic*

**critical value** (z_0.975 or z_0.025) = 1.96

$$SE = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$

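$p_1$ is the sample proportion of white-sounding names that received call backs from prospective employers; it estimates the population proportion $P_1$.  
$p_2$ is the sample proportion of black-sounding names that received call backs from prospective employers; it estimates the population proportion $P_2$.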

$n_1$ is the total number of white-sounding names randomly assigned to a resume.  
$n_2$ is the total number of black-sounding names randomly assigned to a resume.

In [40]:
p1 = w / n1
print 'p1 =', p1

p1 = 0.0965092402464


In [39]:
p2 = b / n2
print 'p2 =', p2

p2 = 0.064476386037


In [41]:
SE = ((p1 * (1 - p1)) / n1 + (p2 * (1 - p2)) / n2) ** 0.5
print 'The standard error of the difference in population proportions is', SE

The standard error of the difference in population proportions is 0.00778337058668


In [43]:
ME = 1.96 * SE
print 'The Margin of Error for a 95% confidence interval between the difference in population proportions is +/-', ME

The Margin of Error for a 95% confidence interval between the difference in population proportions is +/- 0.0152554063499


In [44]:
prop_diff = p1 - p2
print 'The difference in population proportions estimated by our two samples is', prop_diff

The difference in population proportions estimated by our two samples is 0.0320328542094


In [55]:
lower = prop_diff - ME
upper = prop_diff + ME
print 'The 95% confidence interval for the difference in population proportions lies between', lower, 'and', upper

The 95% confidence interval for the difference in population proportions lies between 0.0167774478596 and 0.0472882605593
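The steps above (standard error, margin of error, interval) can be wrapped in one function. This is a minimal sketch, not the notebook's own code: the function name `two_prop_ci` is hypothetical, and it looks up the critical value with `stats.norm.ppf` instead of hard-coding 1.96, so results may differ from the output above in the later decimal places.

```python
from scipy import stats

def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Interval for p1 - p2 using the unpooled standard error."""
    p1 = float(x1) / n1
    p2 = float(x2) / n2
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_crit = stats.norm.ppf(1 - (1 - confidence) / 2.0)
    margin = z_crit * se
    diff = p1 - p2
    return diff - margin, diff + margin, margin

ci_low, ci_high, margin = two_prop_ci(235, 2435, 157, 2435)
print('95% CI: ({:.4f}, {:.4f}), margin of error {:.4f}'.format(
    ci_low, ci_high, margin))
```

Passing a different `confidence` (for example 0.99) widens or narrows the interval without changing any other step.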


**To find a p-value for this difference in population proportions, we must calculate a z-test statistic.**

$$Z = \frac{(p_1 - p_2)}{SE} = \frac{(p_1 - p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}}$$

In [50]:
z = prop_diff / SE
print 'The z-test statistic is', z

The z-test statistic is 4.11555043573


In [54]:
pval =  (1 - stats.norm.cdf(z)) * 2
print 'The p-value for the difference in population proportions is', pval

The p-value for the difference in population proportions is 3.86256520752e-05
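The test statistic above uses the unpooled standard error. Many textbook versions of the two-proportion z-test use a pooled proportion instead, because the null hypothesis assumes the two rates are equal. The sketch below shows that variant for comparison. The function name `pooled_two_prop_ztest` is hypothetical, and the pooled approach is an alternative to, not a replacement for, the calculation in this notebook. It also uses `stats.norm.sf(abs(z))` for the tail probability, which works for either sign of `z`.

```python
from scipy import stats

def pooled_two_prop_ztest(x1, n1, x2, n2):
    """Two-sided z-test for p1 == p2 using the pooled standard error."""
    p1 = float(x1) / n1
    p2 = float(x2) / n2
    p_pool = float(x1 + x2) / (n1 + n2)
    se_pool = (p_pool * (1 - p_pool) * (1.0 / n1 + 1.0 / n2)) ** 0.5
    z = (p1 - p2) / se_pool
    # sf is the upper tail (1 - cdf); abs() handles a negative z.
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value

z_pooled, p_pooled = pooled_two_prop_ztest(235, 2435, 157, 2435)
print('pooled z = {:.3f}, p-value = {:.2e}'.format(z_pooled, p_pooled))

# Cross-check: without continuity correction, the chi-square statistic of
# the 2x2 table (callbacks vs. no callbacks) equals the squared pooled z.
table = [[235, 2435 - 235], [157, 2435 - 157]]
chi2, chi2_p = stats.chi2_contingency(table, correction=False)[:2]
print('chi-square = {:.3f}, p-value = {:.2e}'.format(chi2, chi2_p))
```

With equal group sizes, the pooled and unpooled statistics are close, so the conclusion should not depend on which version is used.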


**4. Discuss statistical significance.**

*We are **95% confident** that the true difference in the proportion of call backs between white-sounding applicants
and black-sounding applicants is between 1.7 and 4.7 percentage points.* That is, under repeated sampling, 95% of the intervals constructed
this way would contain the true difference in population proportions. Since this confidence interval does not include 0,
there is a statistically significant difference in the call back rate between white-sounding and black-sounding applicants.

The two-proportion z-test yielded a **p-value of 3.86e-5**, which is strong statistical evidence against the null hypothesis. The
p-value is the probability of observing a difference in call back rates at least as extreme as the one in these samples, given that the null hypothesis is true.
Since this probability is so low, we reject the null hypothesis that the call back rates are the same for white-sounding
and black-sounding applicants. Because the resumes were identical and the names were assigned at random, this study points to racial bias in
the hiring process.

It is clear that the call back rates are statistically different in this study, but the *effect size* is not easily judged
from the raw difference in proportions. A better way to quantify the effect size is using **Cohen's h** as shown below.

$\phi_1 = 2\arcsin\sqrt{p_1}$  
$\phi_2 = 2\arcsin\sqrt{p_2}$

**h** = $\phi_1 - \phi_2$

In [59]:
phi1 = 2 * np.arcsin(p1 ** 0.5)
phi2 = 2 * np.arcsin(p2 ** 0.5)
h = phi1 - phi2
print "The Cohen's h for the difference in proportions is", h

The Cohen's h for the difference in proportions is 0.118307242719


**A Cohen's h of 0.2 is conventionally considered a small effect size.** Since the Cohen's h measured for this study is *approximately 0.12*, which is below that benchmark, this is
a small effect size, but nonetheless a statistically significant effect.
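For reuse, the Cohen's h calculation can be paired with Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The sketch below is illustrative only: the helper names `cohens_h` and `describe_h` and the label `'below small'` are assumptions, and the benchmarks are rough conventions, not strict cut-offs.

```python
import math

def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def describe_h(h):
    """Label |h| using Cohen's conventional benchmarks (0.2, 0.5, 0.8)."""
    size = abs(h)
    if size < 0.2:
        return 'below small'
    if size < 0.5:
        return 'small'
    if size < 0.8:
        return 'medium'
    return 'large'

# Callback rates from the counts above: 235 and 157 out of 2435 each.
h_value = cohens_h(235 / 2435.0, 157 / 2435.0)
print("h = {:.3f} ({})".format(h_value, describe_h(h_value)))
```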