
### Examining racial discrimination in the US job market

#### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

#### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes.

#### Exercise
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your GitHub account**.

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Discuss statistical significance.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [1]:
import pandas as pd
import numpy as np
from scipy import stats as st

In [2]:
# Load the Stata file through the public pandas entry point
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [4]:
sum(data.call)

392.0

In [5]:
sum(data[data.race=='w'].call)

235.0

**1. What test is appropriate for this problem? Does CLT apply?**

We can use a hypothesis test to compare the callback proportions of white-sounding and black-sounding names to see if there is a difference. 

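The Central Limit Theorem (CLT) states that the sampling distribution of a sample mean is approximately normal regardless of the shape of the underlying data, provided the sample is large enough. A sample proportion is the mean of 0/1 outcomes, so the same result applies here. For proportions, the usual rule of thumb is not simply "at least 30 rows" but that each group contains at least about 10 successes and 10 failures. Both groups have far more callbacks (157 and 235) and non-callbacks than that, and the résumés were assigned names at random, so the CLT applies.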
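The large-sample condition can be checked directly from the counts. The sketch below is only illustrative: the callback counts are the ones printed by the notebook, but the group size of 2435 is an assumption inferred from the printed difference in proportions (78 / 2435 ≈ 0.03203), and `clt_ok` is a hypothetical helper, not part of the notebook.

```python
# Callback counts printed earlier in the notebook.
callbacks = {"b": 157, "w": 235}

# Assumption: group size inferred from (235 - 157) / n = 0.0320328...,
# it is not printed anywhere in the notebook.
n_per_group = 2435


def clt_ok(successes, n, threshold=10):
    """Rule of thumb: at least `threshold` successes and failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


conditions = {race: clt_ok(k, n_per_group) for race, k in callbacks.items()}
print(conditions)
```

With real data, the same check can be run on `len(df)` and `sum(df.call)` for each group instead of hard-coded numbers.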

**2. What are the null and alternate hypotheses?**

The null hypothesis is that race has no effect on the callback rate: the true callback proportion for résumés with white-sounding names equals that for résumés with black-sounding names. The alternate hypothesis is that the two proportions differ, that is, race has an effect. Note that the hypotheses are statements about the true proportions; "significance" describes the outcome of the test, not the hypotheses themselves.
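In symbols, the hypotheses can be written as follows, where $p_w$ and $p_b$ are notation introduced here (not used elsewhere in the notebook) for the true callback proportions of white-sounding and black-sounding names:

$$
H_0:\; p_w - p_b = 0
\qquad\text{versus}\qquad
H_A:\; p_w - p_b \neq 0
$$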

**3. Compute margin of error, confidence interval, and p-value.**

We can think of black-sounding and white-sounding names as two Bernoulli distributions.

In [6]:
df_b = data[data.race=='b'] # Dataframe with only black-sounding names
total_b = len(df_b) # Total number of black-sounding names
df_bcb = sum(df_b.call) # Number of black-sounding names that were called back

In [7]:
df_w = data[data.race=='w'] # Dataframe with only white-sounding names
total_w = len(df_w) # Total number of white-sounding names
df_wcb = sum(df_w.call) # Number of white-sounding names that were called back

In [8]:
# Find sample means
p_bcb = df_bcb/total_b # Sample mean of black-sounding names called back
p_wcb = df_wcb/total_w # Sample mean of white-sounding names called back

In [9]:
# Find the standard error of the difference in sample proportions of 'b' and 'w' names called back.
std_diff = np.sqrt((p_bcb * (1-p_bcb)/total_b) + (p_wcb * (1-p_wcb)/total_w))

In [10]:
std_diff

0.0077833705866767544
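Written out, the quantity computed above is the standard error of a difference between two independent sample proportions. Here $\hat{p}_b$, $\hat{p}_w$, $n_b$ and $n_w$ are shorthand for the notebook variables `p_bcb`, `p_wcb`, `total_b` and `total_w`:

$$
\mathrm{SE}(\hat{p}_w - \hat{p}_b)
= \sqrt{\frac{\hat{p}_b\,(1-\hat{p}_b)}{n_b} + \frac{\hat{p}_w\,(1-\hat{p}_w)}{n_w}}
$$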

In [11]:
# Find the difference between sample means of 'b' and 'w' names called back.
# This is the mean of the distribution of P1-P2 (or Pw-Pb)
mu_diff = p_wcb - p_bcb

In [12]:
mu_diff

0.032032854209445585

For a 95% confidence interval, the margin of error is about 2 standard errors. The exact multiplier is 1.96, but 2 is a close approximation and is what we use here.

In [13]:
# Find margin of error
me = 2 * std_diff

In [14]:
me

0.015566741173353509

So, the margin of error is about 0.0156. The 95% confidence interval is the estimated difference ± the margin of error.

In [15]:
mu_upper = mu_diff + me
mu_lower = mu_diff - me

In [16]:
mu_upper

0.047599595382799093

In [17]:
mu_lower

0.016466113036092078

So, we are 95% confident that the true difference in callback rates between white-sounding and black-sounding names (white minus black) lies between 0.0165 and 0.0476. The interval does not contain 0, which is consistent with a real difference.
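The interval above uses a multiplier of 2 rather than the exact normal quantile. As a rough check of how much that rounding matters, the sketch below recomputes the margin of error with the exact 97.5th percentile. It takes the two values printed by the notebook as inputs and uses the standard library's `statistics.NormalDist` (Python 3.8+) as a stand-in for `scipy.stats.norm`; the names `z_crit`, `me_exact` and `ci_exact` are introduced here for illustration.

```python
from statistics import NormalDist

# Values printed by the notebook cells above.
mu_diff = 0.032032854209445585    # difference in sample proportions (w - b)
std_diff = 0.0077833705866767544  # standard error of that difference

# Exact two-sided 95% critical value of the standard normal (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

me_exact = z_crit * std_diff   # margin of error with the exact multiplier
me_rounded = 2 * std_diff      # margin of error used in the notebook

ci_exact = (mu_diff - me_exact, mu_diff + me_exact)
print(f"z_crit = {z_crit:.3f}")
print(f"exact margin of error = {me_exact:.4f}")
print(f"exact 95% CI = ({ci_exact[0]:.4f}, {ci_exact[1]:.4f})")
```

The exact interval is slightly narrower than the one built with a multiplier of 2, and it still excludes 0, so the conclusion does not depend on the rounding.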

**4. Discuss statistical significance.**

In [18]:
# Approximate the sampling distribution of the difference as normal, centred on the
# observed difference with the standard error computed above, and find the area below 0.
# This is a one-sided tail probability; the two-sided p-value is twice this value.
p_val = st.norm.cdf(0,mu_diff,std_diff)
p_val

1.9312826037613112e-05

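The p-value is far below 0.05 (the 5% significance level). The value above is a one-sided tail probability; doubling it for the two-sided alternative gives roughly 3.9e-05, which is still far below 0.05. In other words, if race truly had no effect on callbacks, a difference at least this large between the 'b' and 'w' callback rates would occur only about 4 times in 100,000 experiments.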

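So, we reject the null hypothesis in favour of the alternate hypothesis: the race signaled by the name on a resume has a statistically significant effect on the callback rate. Because names were assigned to resumes at random, the difference can be attributed to the name rather than to other resume characteristics.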
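As a cross-check on the p-value, a common textbook alternative is the two-sided two-proportion z-test, which pools the two groups to estimate the standard error under the null hypothesis. This is a different (swapped-in) procedure from the one used in the cell above, shown only as a sketch. The callback counts come from the notebook; the group size of 2435 is an assumption inferred from the printed difference in proportions, and `statistics.NormalDist` stands in for `scipy.stats.norm`.

```python
from math import sqrt
from statistics import NormalDist

# Callback counts printed by the notebook.
calls_b, calls_w = 157, 235

# Assumption: equal group sizes inferred from (235 - 157) / n = 0.0320328...
n_b = n_w = 2435

# Under H0 both groups share one callback rate, so pool the data.
p_pool = (calls_b + calls_w) / (n_b + n_w)
se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_b + 1 / n_w))

# z statistic for the observed difference (white minus black).
z = (calls_w / n_w - calls_b / n_b) / se_pool

# Two-sided p-value: probability of a difference at least this extreme
# in either direction if H0 were true.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.1e}")
```

Under these assumptions the z statistic comes out a little above 4 and the two-sided p-value stays far below 0.05, so the pooled test leads to the same decision.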