# Examining Racial Discrimination in the US Job Market

### Background
Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés to black-sounding or white-sounding names and observing the impact on requests for interviews from employers.

### Data
In the dataset provided, each row represents a resume. The 'race' column has two values, 'b' and 'w', indicating black-sounding and white-sounding. The column 'call' has two values, 1 and 0, indicating whether the resume received a call from employers or not.

Note that the 'b' and 'w' values in race are assigned randomly to the resumes when presented to the employer.


### Exercises
You will perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.

Answer the following questions **in this notebook below and submit to your Github account**. 

   1. What test is appropriate for this problem? Does CLT apply?
   2. What are the null and alternate hypotheses?
   3. Compute margin of error, confidence interval, and p-value.
   4. Write a story describing the statistical significance in the context of the original problem.
   5. Does your analysis mean that race/name is the most important factor in callback success? Why or why not? If not, how would you amend your analysis?

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


#### Resources
+ Experiment information and data source: http://www.povertyactionlab.org/evaluation/discrimination-job-market-united-states
+ Scipy statistical methods: http://docs.scipy.org/doc/scipy/reference/stats.html 
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet


In [1]:
import pandas as pd
import numpy as np
from scipy import stats

In [2]:
data = pd.read_stata('data/us_job_market_discrimination.dta')

In [3]:
# number of callbacks for black-sounding names
sum(data[data.race=='b'].call)

157.0

In [4]:
data.head()

Unnamed: 0,id,ad,education,ofjobs,yearsexp,honors,volunteer,military,empholes,occupspecific,...,compreq,orgreq,manuf,transcom,bankreal,trade,busservice,othservice,missind,ownership
0,b,1,4,2,6,0,0,0,1,17,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,b,1,3,3,6,0,1,1,0,316,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,b,1,4,1,6,0,0,0,0,19,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
3,b,1,3,4,6,0,1,0,1,313,...,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,
4,b,1,3,3,22,0,0,0,0,313,...,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,Nonprofit


In [6]:
len(data)

4870

In [7]:
# number of callbacks for white-sounding names
sum(data[data.race=='w'].call)

235.0

There is a difference of 78 callbacks between the two categories. However, raw counts are only comparable if the groups are the same size, so we should look at how many resumes there were in each group.

In [9]:
print(len(data[data.race=='w']))
print(len(data[data.race=='b']))

2435
2435
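Since both groups contain 2,435 resumes, the callback counts convert directly into rates. A minimal sketch using only the counts printed above (the variable names are illustrative and not part of the original notebook):

```python
# Counts copied from the notebook outputs above
calls = {"b": 157, "w": 235}
n_resumes = {"b": 2435, "w": 2435}

# Callback rate = callbacks / resumes sent, per group
rates = {race: calls[race] / n_resumes[race] for race in calls}

for race, rate in rates.items():
    print(f"callback rate ({race}): {rate:.4f}")
print(f"difference (w - b): {rates['w'] - rates['b']:.4f}")
```

This puts the callback rate at roughly 6.4% for black-sounding names and 9.7% for white-sounding names.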


## Chi-Square Test
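Alright, so the two groups are the same size: 2,435 resumes with black-sounding names and 2,435 with white-sounding names. Both variables are binary categorical (race: 'b'/'w'; call: 0/1), so a Chi-Square test of independence on the 2x2 contingency table is an appropriate way to compare the two callback proportions. Equal group sizes are convenient but not required for this test. The CLT-based approximation behind it is reasonable here because each group has well over 100 callbacks and well over 100 non-callbacks. The null hypothesis is that the callback rate is the same for both groups; the alternative is that the rates differ.

H0: The callback rate is equal for resumes with black-sounding and white-sounding names.

H1: The callback rates are not equal.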
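Whether the large-sample (CLT) approximation is reasonable can be checked from the counts alone. The sketch below computes the expected number of callbacks and non-callbacks per group under H0 and compares them with a common rule of thumb (at least 10 per cell; some texts use 5). The threshold and variable names are assumptions for illustration, not part of the original analysis.

```python
# Counts copied from the notebook outputs above
n_b = n_w = 2435
calls_b, calls_w = 157, 235

# Under H0 both groups share one callback rate, estimated by pooling
p_pooled = (calls_b + calls_w) / (n_b + n_w)

# Expected callbacks and non-callbacks in each group under H0
expected = {
    "b_call": n_b * p_pooled,
    "b_no_call": n_b * (1 - p_pooled),
    "w_call": n_w * p_pooled,
    "w_no_call": n_w * (1 - p_pooled),
}

MIN_EXPECTED = 10  # rule-of-thumb threshold (assumption)
clt_ok = all(count >= MIN_EXPECTED for count in expected.values())

print(f"pooled callback rate: {p_pooled:.4f}")
print(f"smallest expected count: {min(expected.values()):.1f}")
print("large-sample approximation reasonable:", clt_ok)
```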

In [57]:
# Create the contingency table
race_call = pd.crosstab(index=data["race"], 
                           columns=data["call"])
race_call.index= ["black","white"]
race_call

call,0.0,1.0
black,2278,157
white,2200,235


In [58]:
chi2, p, dof, ex = stats.chi2_contingency(race_call)
print(chi2)
print(p)

16.4490285842
4.99757838996e-05
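For a 2x2 table, `stats.chi2_contingency` applies Yates' continuity correction by default (`correction=True`), which is why the statistic above is slightly smaller than the plain Pearson chi-square. The sketch below reproduces both versions by hand with the standard library; the helper names are illustrative. Without the correction, the statistic equals the square of the two-proportion z-statistic.

```python
import math

# Observed counts (rows: black, white; columns: no call, call)
observed = [[2278, 157], [2200, 235]]

def chi_square_2x2(table, yates):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(obs - expected)
            if yates:
                # Shrink each deviation by 0.5, but never below zero
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    return stat

def p_value_1dof(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

chi2_yates = chi_square_2x2(observed, yates=True)
chi2_plain = chi_square_2x2(observed, yates=False)
z = math.sqrt(chi2_plain)  # two-proportion z-statistic (pooled standard error)

print(f"with Yates correction: chi2 = {chi2_yates:.4f}, p = {p_value_1dof(chi2_yates):.3e}")
print(f"without correction:    chi2 = {chi2_plain:.4f}, p = {p_value_1dof(chi2_plain):.3e}")
print(f"z = {z:.3f}")
```

Both versions lead to the same conclusion here; the correction only matters noticeably when cell counts are small.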


Alright, the p-value is about 5x10^-5, far below the conventional 0.05 threshold, so we reject the null hypothesis: the call-back rate for white-sounding names differs from that for black-sounding names. The downside of the chi-square test is that the statistic does not directly convey the effect size. One option is the phi coefficient, defined as sqrt(chi-square / n), where n is the total sample size. A more straightforward approach is to use the odds ratio to measure the effect size.
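For reference, the phi coefficient mentioned above takes one line to compute. This sketch plugs in the chi-square statistic and sample size printed earlier in the notebook; values near 0.1 are conventionally read as a small effect, which is a rule of thumb rather than a result from this analysis.

```python
import math

chi2_stat = 16.4490285842  # statistic printed by chi2_contingency above
n_total = 4870             # len(data)

# Phi coefficient: effect size for a 2x2 contingency table
phi = math.sqrt(chi2_stat / n_total)
print(f"phi = {phi:.4f}")
```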

In [65]:
# Odds that a resume with a white sounding name receives a call-back
w_odds = 235/2200
print('white name odds =', w_odds)

# Odds that a resume with a black sounding name receives a call-back
b_odds = 157/2278
print('black name odds =', b_odds)

print('odds ratio = ', w_odds/b_odds)

white name odds = 0.10681818181818181
black name odds = 0.06892010535557506
odds ratio =  1.5498841922408801


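So we can say that the odds of a call-back for a white-sounding name are about 1.55 times the odds for a black-sounding name. Because call-backs are fairly rare in both groups, this is close to, but not the same as, the ratio of call-back rates (235/157, about 1.50). Now, let's find the SE and 95% CI of the odds ratio for white-sounding names compared to black-sounding ones.

SE of ln(OR) = sqrt((1/a) + (1/b) + (1/c) + (1/d))

95% CI = exp(ln(OR) +- 1.96*SE)

Where a, b, c, and d are the four cell counts of the contingency table (call-backs and non-call-backs for each group), ln(OR) is the log-odds ratio, and the +- means 'plus or minus'.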

In [71]:
OR = w_odds/b_odds
# SE
SE = np.sqrt((1/race_call.iloc[0,0]) + (1/race_call.iloc[0,1]) + (1/race_call.iloc[1,0]) + (1/race_call.iloc[1,1]))
print("SE:", SE)

# 95% CI
CI_Lower = np.exp(np.log(OR) - 1.96*SE)
CI_Upper = np.exp(np.log(OR) + 1.96*SE)
print('CI:', CI_Lower, CI_Upper)

SE: 0.107323217049
CI: 1.2558676748 1.91273416583
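The interval above is on the odds-ratio scale. The exercise also asks for a margin of error, which is easier to interpret on the scale of the difference in call-back rates. The sketch below computes it with the usual unpooled standard error and the same 1.96 critical value; it is an additional calculation, not part of the original analysis, and the variable names are illustrative.

```python
import math

# Counts copied from the notebook outputs above
n_b = n_w = 2435
p_b = 157 / n_b  # call-back rate, black-sounding names
p_w = 235 / n_w  # call-back rate, white-sounding names

diff = p_w - p_b

# Unpooled standard error of the difference between two proportions
se_diff = math.sqrt(p_b * (1 - p_b) / n_b + p_w * (1 - p_w) / n_w)

margin = 1.96 * se_diff  # 95% margin of error
ci_diff = (diff - margin, diff + margin)

print(f"difference in call-back rates: {diff:.4f}")
print(f"margin of error (95%): {margin:.4f}")
print(f"95% CI: ({ci_diff[0]:.4f}, {ci_diff[1]:.4f})")
```

The interval runs from roughly 1.7 to 4.7 percentage points and excludes zero, which agrees with the chi-square result.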


### With Regards to Race Name in Determining Success in Call-Backs
A chi-square test was performed on the 2x2 contingency table of name type (black-sounding or white-sounding) against call-back outcome. With these two variables isolated, there was a statistically significant difference between the two groups. To quantify that difference, the odds ratio was calculated along with its 95% confidence interval, roughly 1.26 to 1.91. Because the whole interval lies above 1, even the lower bound implies that a white-sounding name raises the odds of a call-back by about 26%. The width of the interval shows that the exact size of the effect remains uncertain.

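While odds ratios are good at showing the effect size for a comparison of proportions like this one, they do not take other factors into account. Names were assigned to resumes at random, which should keep the other resume characteristics balanced between the two groups on average, but a chance imbalance is still possible: a variable in the dataframe could happen to be correlated with white-sounding names and also increase the rate of call-backs. Nor does this analysis show that race/name is the most important factor, since it was never compared against the other variables. To pursue this, I would recommend that future analysis construct a correlation table to find the variables that are highly correlated with race and with call-backs, and investigate whether any of them made a significant difference in the number of call-backs.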